Commit graph

214295 commits

Author SHA1 Message Date
Georg Lehmann
7ac67e2711 aco/isel: emit vop2 v_max_f64 for gfx12+
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38156>
2025-10-31 08:31:03 +00:00
Georg Lehmann
8397b91934 aco/isel: emit vop2 v_min_f64 for gfx12+
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38156>
2025-10-31 08:31:02 +00:00
Georg Lehmann
2e120d4e26 aco/isel: emit vop2 v_mul_f64 for gfx12+
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38156>
2025-10-31 08:31:01 +00:00
Georg Lehmann
86ea462f4d aco/isel: emit vop2 v_fadd_f64 for gfx12+
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38156>
2025-10-31 08:31:01 +00:00
Georg Lehmann
7d2325b194 aco/lower_to_hw: emit vop2 for gfx12+ fp64 reductions
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38156>
2025-10-31 08:31:00 +00:00
Samuel Pitoiset
968fb06a94 radv,vulkan: replace VK_RENDERING_INPUT_ATTACHMENT_NO_CONCURRENT_WRITES_BIT_MESA
The new flag from maintenance10 has similar meaning.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38043>
2025-10-31 07:51:23 +00:00
Samuel Pitoiset
c8aaf3f5b5 radv: advertise VK_KHR_maintenance10
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38043>
2025-10-31 07:51:22 +00:00
Samuel Pitoiset
14639898d0 radv: add support for controlling sRGB transfer function with resolves
Just need to use UNORM image views.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38043>
2025-10-31 07:51:22 +00:00
Samuel Pitoiset
0034f5a948 radv: allow ds<->color copies on compute/transfer queues
This should be supported just fine.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38043>
2025-10-31 07:51:22 +00:00
Samuel Pitoiset
49128926d6 radv: implement new input attachment information for dynamic rendering
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38043>
2025-10-31 07:51:21 +00:00
Samuel Pitoiset
18fec61c8d radv: reverse the logic for NO_CONCURRENT_WRITES_BITS_MESA
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38043>
2025-10-31 07:51:21 +00:00
Samuel Pitoiset
d3924f5bd6 radv: add support for depth/stencil resolves with vkCmdResolve2()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38043>
2025-10-31 07:51:20 +00:00
Samuel Pitoiset
8306256e2a radv: allow NULL pSamplesMask with vkCmdSetSampleMaskEXT()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38043>
2025-10-31 07:51:19 +00:00
Samuel Pitoiset
d5d2a4ad07 radv: implement vkCmdEndRendering2KHR()
Common runtime code already does CmdEndRendering()->CmdEndRendering2().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38043>
2025-10-31 07:51:19 +00:00
Paul Gofman
63aec75981 driconf: add a workaround for Investigation Stories : gunsound
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
CC: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38152>
2025-10-31 07:27:45 +00:00
Aitor Camacho
3f7ee1fef8 kk: Use our own driverID value
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
LOLed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38175>
2025-10-31 03:07:45 +00:00
Marek Olšák
9def0a6e5b ac/nir: set support_indirect_inputs/outputs in common code
This fixes mesh shader performance of RADV for GravityMark by stopping
the lowering of ClipDistance[64][4] indirect access for mesh shader outputs.

The perf improvement is 14% on Navi48.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38155>
2025-10-31 00:57:46 +00:00
Marek Olšák
86dd74aaeb nir/lower_indirect_derefs: don't lower compact arrays unconditionally to fix perf
This fixes bad mesh shader performance. See the comment.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38155>
2025-10-31 00:57:46 +00:00
Christian Gmeiner
cb6cb2697e etnaviv: isa: Add norm_mul instruction
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Blob generates such norm_mul for glmark2:shadow benchmark on STM32MP257.

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38172>
2025-10-31 00:34:57 +01:00
Dylan Baker
79f4eca2f0 anv: Fix potential overflow from doing 32bit math on 64bit types
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
By using a 64bit unsigned to iterate instead of a plain unsigned

CID: 1646981
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35243>
2025-10-30 21:36:58 +00:00
Dylan Baker
a6102f7432 intel/mda: Use a vector to track the contents variable
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
I'm not really sure why Coverity doesn't tag the `delete[]` as a
potential leak since it also happens after ASSERT macros, like it did
with the call to `fclose()`.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37744>
2025-10-30 20:03:24 +00:00
Dylan Baker
9509d0d8b8 intel/mda: Use GTEST fixtures to manage File handles
Coverity points out that if the asserts fail, then the file won't be
closed, and therefore wont be deleted. We'd like to avoid littering the
temp directory with useless files.

This uses GTEST's `TEST_F` feature with a custom class to manager the
creation and destruction of the tmpfile.

CID: 1666502
CID: 1666525
CID: 1666579
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37744>
2025-10-30 20:03:24 +00:00
Daniel Schürmann
b3615e5d6f nir/algebraic: ad-hoc constant-fold ALU instructions
Slight differences due to different optimization order.
Totals from 135 (0.17% of 79839) affected shaders: (Navi48)
Instrs: 287852 -> 287527 (-0.11%); split: -0.15%, +0.03%
CodeSize: 1522972 -> 1521764 (-0.08%); split: -0.12%, +0.04%
Latency: 1806803 -> 1825754 (+1.05%); split: -0.08%, +1.12%
InvThroughput: 242693 -> 244703 (+0.83%); split: -0.02%, +0.84%
VClause: 4092 -> 4084 (-0.20%)
SClause: 7462 -> 7478 (+0.21%)
Copies: 20509 -> 20401 (-0.53%); split: -0.74%, +0.21%
Branches: 6395 -> 6386 (-0.14%)
PreSGPRs: 7334 -> 7337 (+0.04%); split: -0.03%, +0.07%
PreVGPRs: 6375 -> 6382 (+0.11%)
VALU: 151787 -> 151595 (-0.13%); split: -0.15%, +0.02%
SALU: 52967 -> 52910 (-0.11%); split: -0.23%, +0.12%
VMEM: 6704 -> 6696 (-0.12%)
SMEM: 12099 -> 12129 (+0.25%)

Tested on a small collection of 2518 shaders from Dredge with callgrind using RADV:
baseline:
  nir_opt_algebraic was called 12917 times from radv_optimize_nir()
  nir_opt_cse was called 15204 times from radv_optimize_nir()
  relative time spent in radv_optimize_nir(): 31.48%
  total instruction fetch cost: 28,642,638,021

with nir/algebraic: ad-hoc constant-fold ALU instructions
  nir_opt_algebraic was called 12797 times from radv_optimize_nir()
  nir_opt_cse was called 12963 times from radv_optimize_nir()
  relative time spent in radv_optimize_nir(): 30.63%
  total instruction fetch cost: 28,284,386,123

=> ~1.27% improvement in total compile times

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37195>
2025-10-30 19:28:07 +00:00
Daniel Schürmann
10be538851 tree-wide: don't call nir_opt_constant_folding after nir_lower_flrp
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37195>
2025-10-30 19:28:07 +00:00
Daniel Schürmann
9039e24751 nir/lower_flrp: ad-hoc constant-fold ALU instructions
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37195>
2025-10-30 19:28:07 +00:00
Daniel Schürmann
f61cd64af8 nir/builder: add option to immediately constant-fold ALU instructions upon insertion
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37195>
2025-10-30 19:28:07 +00:00
Daniel Schürmann
280eb2d689 vulkan/nir: call nir_opt_constant_folding() during vk_spirv_to_nir()
This prevents bugged CTS tests from tripping over with the following commits.

dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp32.generated_args.denorm_sstep_denorm_flush_to_zero
dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp32.generated_args.denorm_sstep_denorm_flush_to_zero_*

These tests exhibit undefined values where the result depends on the ordering
of nir_opt_algebraic and nir_opt_constant_folding.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37195>
2025-10-30 19:28:07 +00:00
Daniel Schürmann
870616af34 nir/constant_folding: switch to nir_shader_lower_instructions()
Small differences due to implicit DCE.
Totals from 76 (0.10% of 79839) affected shaders: (Navi48)

Instrs: 168051 -> 168044 (-0.00%); split: -0.01%, +0.01%
CodeSize: 893284 -> 893256 (-0.00%); split: -0.01%, +0.01%
Latency: 1082007 -> 1082027 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 155100 -> 155105 (+0.00%)
Copies: 9649 -> 9654 (+0.05%)
VALU: 92504 -> 92509 (+0.01%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37195>
2025-10-30 19:28:07 +00:00
Daniel Schürmann
d1f2f1222e nir: guard nir_def_as_alu()
We will potentially create load_const_instr instead of ALU.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37195>
2025-10-30 19:28:06 +00:00
Daniel Schürmann
3180656bbc nir: don't use nir_build_alu() with incomplete sources
Ideally we'd have a version that takes nir_scalar arguments.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37195>
2025-10-30 19:28:06 +00:00
Daniel Schürmann
ef9ecc4058 nir: add nir_imul_nuw() and nir_imul_imm_nuw() helpers
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37195>
2025-10-30 19:28:06 +00:00
Sushma Venkatesh Reddy
36d7cd0514 drirc: Add anv_assume_full_subgroups for Detroit: Become Human
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Add workaround to assume full 32-thread subgroups. This fixes rendering
corruption when running the Detriot: Become Human game using anv driver.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14120

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38168>
2025-10-30 18:27:34 +00:00
Caio Oliveira
3334284845 brw: Don't set destination of branch instructions
In Gfx9+ the destination should be set to ARF null in all those cases, the
use of IP was a requirement of old versions only.  The already zeroed
bits will encode ARF null, so no need to set.

Skipping the helper avoids setting unwanted bits (like hstride), which
in Gfx12+ are MBZ.

This patch adjust the expectations of the asm tests to remove the dst
type and dst stride fields -- will expect them all zeroed.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36454>
2025-10-30 17:18:15 +00:00
Caio Oliveira
8c45ff9acb brw: Set relevant immediate bits for Gfx9-11 in JIP and UIP helpers
This is better than using the generic helper since will not set unwanted
bits (e.g. hstride) and it is already handling their case for Gfx12+
anyway.

There's an extra helper now for the case where src1 is not used.  In
Gfx9-11 it needs to be set to ARF but with a matching type of src0.

Assembler was updated to follow the same approach.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36454>
2025-10-30 17:18:15 +00:00
Caio Oliveira
adc353da3c brw: Fix MOV_INDIRECT lowering for various platforms
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Even though some platforms support int64 they don't support indirect
movs with 64-bit values.  Effectively this is only supported for non-LP
Gfx9.

This fixes various tests in dEQP-VK.spirv_assembly.instruction.compute.untyped_pointers.*.push_constant.*64*
on BMG.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38125>
2025-10-30 16:06:42 +00:00
Caio Oliveira
538fd7266e brw: Fix EU validation of VxH and Vx1 region
Use same approach as the other code checking for this vstride.  Argument
could be made we want to reuse the same enum value for both the encoded
and decoded version, but for now follow the existing practice.

This will cause
dEQP-VK.spirv_assembly.instruction.compute.untyped_pointers.vulkan_memory_model.type_punning.load.push_constant.int64_to_uint64
and similar tests to fail validation on BMG.  Later patch will fix that.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38125>
2025-10-30 16:06:42 +00:00
Rob Clark
15924e941c loader: Ignore empty override strings
If you somehow have MESA_LOADER_DRIVER_OVERRIDE= in your environment,
you certainly weren't trying to force load the driver named "".

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38165>
2025-10-30 15:44:40 +00:00
Serdar Kocdemir
7bc14977a8 gfxstream: Check host allocation mode for external memory
New host feature allows usage of external_memory_host extension
for external memory support, mainly for software renderers which
don't implement platform specific extensions.

Also moves enablement of queue_family_foreign outside of android,
to allow it also on linux guests (ref: github PR#74), and removes
the check for VK_MVK_moltenvk extension in favor of metal mode.

Test: -gpu lavapipe -feature VulkanNativeSwapchain on windows

Reviewed-by: David Gilhooley <djgilhooley.gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38153>
2025-10-30 15:27:26 +00:00
Jason Macnak
deb48556dc gfxstream: codegen changes for new filenames and namespaces
Bug: n/a
Test: build + CI

Reviewed-by: David Gilhooley <djgilhooley.gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38153>
2025-10-30 15:27:26 +00:00
Gurchetan Singh
a85e9b8d8a gfxstream: fix build after VK 1.4.33.0 spec update
The root cause is "Add initial Vulkan Base support", which refactors
Vulkan versions into smaller features (base + graphics + compute).
Not sure if this is the cleanest solution, but someone needs to
refactor gfxstream codegen anyways.

There's also can issue with VkRenderingArea.

Fixes: 61c71733c8 ("vulkan: update spec to 1.4.330")
TEST=libgfxstream_vulkan.so + gfxstream_backend.so (host) commpiles
     with new changes

Reviewed-by: David Gilhooley <djgilhooley.gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38153>
2025-10-30 15:27:26 +00:00
Faith Ekstrand
ba2645bc6d nvk: Enable ASTC on Tegra
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38163>
2025-10-30 15:11:25 +00:00
Faith Ekstrand
f7412bd229 nvk: Add an NVK_DEBUG=coherent flag
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38163>
2025-10-30 15:11:24 +00:00
Faith Ekstrand
5a5862c025 nvk: Document some environment variables
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38163>
2025-10-30 15:11:23 +00:00
Faith Ekstrand
2f6b3b6b91 nvk: Don't re-initialize the descriptor writer if the set matches
The logic here before was wrong.  In the case where the set is the same,
it would avoid the flush but then re-initialize anyway, loosing the
dirty information and causing us not to actually flush out all the
descriptors.

Fixes: 1f0fda22f7 ("nvk: Flush descriptor set maps")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38163>
2025-10-30 15:11:23 +00:00
Mike Blumenkrantz
bae22fec7d zink: return mesh pipeline when creating mesh pipelines
Fixes: b05d93e71e ("zink: set gfx_pipeline_state::mesh_pipeline when updating pipeline")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38161>
2025-10-30 14:38:26 +00:00
Rob Clark
532a896f80 freedreno/a6xx: Additional handle import logging
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Add additional error logging which can be enabled with
FD_MESA_DEBUG=layout to make it easier to debug handle
import failures.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37144>
2025-10-30 14:13:24 +00:00
Juan A. Suarez Romero
9d6bac7004 broadcom/ci: unlock more CI-Tron jobs
They will be run in parallel with baremetal ones, and after ensuring it
works fine, baremetal will be disabled.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38159>
2025-10-30 13:29:16 +00:00
Lorenzo Rossi
fa2c8c9d39 nak: Fix delay insertion missing WaR
Current code clears register writes after a register read is
encountered, this handles the first WaR but hides the write from the
reads that will succeed the first one. Ignoring subtle WaRaR hazards.
To fix this, we don't clear writes when a register read is encountered.

Thanks to Karol Herbst for finding and distilling this issue.

Signed-off-by: Lorenzo Rossi <git@rossilorenzo.dev>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37108>
2025-10-30 12:49:12 +00:00
Lorenzo Rossi
7c39f69871 nak: Add cross-block instruction delay scheduling
Currently each block schedules instruction independently from other
blocks. Instructions in the block must then be scheduled conservatively
to remove possible hazards that can occur in previous blocks.

Replace the algorithm with an optimistic data-flow pass that takes into
account all the following blocks. Gains a minor performance improvement
across every shader (1-2%) and should never have any runtime performance
degradation.

Benchmarks:
furmark       34932 -> 35597 (+2%)
pixmark-piano 9027  ->  9113 (+1%)

Signed-off-by: Lorenzo Rossi <git@rossilorenzo.dev>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37108>
2025-10-30 12:49:12 +00:00
Lorenzo Rossi
e6d4eaed2a nak/reg_tracker: Add SparseRegTracker
Signed-off-by: Lorenzo Rossi <git@rossilorenzo.dev>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37108>
2025-10-30 12:49:12 +00:00