fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-20 02:38:07 +02:00

Author	SHA1	Message	Date
Samuel Pitoiset	fb43d7bff2	ac/perfcounter: re-order GPU perf blocks on GFX12 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39199>	2026-01-12 08:10:31 +00:00
Samuel Pitoiset	3b6ff80d48	ac/perfcounter: define more GPU blocks on GFX12 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39199>	2026-01-12 08:10:31 +00:00
Samuel Pitoiset	eb37d6ceb7	ac/perfcounter: fix computing number of 16-bit/32-bit SPM counters Determine them only when both are explicitly 0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39199>	2026-01-12 08:10:30 +00:00
Samuel Pitoiset	d1efdc7e76	ac/perfcounter: fix number of 32-bit SPM counters Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39199>	2026-01-12 08:10:29 +00:00
Samuel Pitoiset	9fe57d3882	ac/spm: define new per-shader engine blocks Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39199>	2026-01-12 08:10:29 +00:00
Samuel Pitoiset	60fac38491	ac/spm: fix typo in one GPU perf block name Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39199>	2026-01-12 08:10:29 +00:00
Samuel Pitoiset	db02077c8a	radv: remove extra instructions after UNREACHABLE Minor cleanups. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39237>	2026-01-12 07:41:08 +00:00
Samuel Pitoiset	e1e2517664	radv: use UNREACHABLE for illegal texture filter Found this with a broken CTS test, way easier to crash for isolating the test case. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39237>	2026-01-12 07:41:08 +00:00
Samuel Pitoiset	91e0f8f1e5	radv/rt: fix a compilation warning about uninitialized fields Just zero-initialize the layout struct to fix the following warning because radv_use_bvh8() might return FALSE. ../src/amd/vulkan/radv_acceleration_structure.c: In function ‘radv_update_as_gfx12’: ../src/amd/vulkan/radv_acceleration_structure.c:873:70: warning: ‘layout.bounds_offsets’ may be used uninitialized [-Wmaybe-uninitialized] 873 \| .bounds = state->build_info->scratchData.deviceAddress + layout.bounds_offsets, \| ~~~~~~^~~~~~~~~~~~~~~ ../src/amd/vulkan/radv_acceleration_structure.c:866:33: note: ‘layout.bounds_offsets’ was declared here 866 \| struct update_scratch_layout layout; \| ^~~~~~ Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39228>	2026-01-12 07:18:50 +00:00
Konstantin Seurer	077292f65b	radv/bvh: Use box16 nodes when bvh8 is not used Using box16 nodes trades bvh quality for memory bandwidth which seems to be roughly equal in performance. Stats assuming box16 nodes are as expensive as box32 nodes: Totals from 7668 (79.68% of 9624) affected BVHs: compacted_size: 951666944 -> 742347648 (-22.00%) max_depth: 57606 -> 57615 (+0.02%) sah: 129114796242 -> 129998517775 (+0.68%); split: -0.00%, +0.68% scene_sah: 188564162 -> 192063633 (+1.86%); split: -0.02%, +1.88% box16_node_count: 0 -> 3270600 (+inf%) box32_node_count: 3365707 -> 95100 (-97.17%) Reviewed-by: Natalie Vock <natalie.vock@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37883>	2026-01-10 11:36:28 +01:00
Konstantin Seurer	543a88af99	radv/bvh: Add radv_aabb16 and use it for box16 nodes Reviewed-by: Natalie Vock <natalie.vock@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37883>	2026-01-10 11:36:19 +01:00
Konstantin Seurer	fefdad9249	radv/rra: Count box16 nodes properly Otherwise rra won't allocate memory when loading the capture. Reviewed-by: Natalie Vock <natalie.vock@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37883>	2026-01-10 11:34:18 +01:00
Konstantin Seurer	39d58a55a7	aco: Add support to f2f16 with rtpi/rtni Those rounding modes are needed when computing 16-bit bounding boxes since the bounding box must not get smaller. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37883>	2026-01-10 11:34:12 +01:00
Alyssa Rosenzweig	235e868ef7	ac/nir: use nir_is_shared_access Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39219>	2026-01-09 20:51:13 +00:00
Benjamin Cheng	499d9e2e98	radv/video: Allow aliasing of video images Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39109>	2026-01-09 13:52:56 +00:00
Georg Lehmann	6d07a56c6a	ac/nir/lower_ps_late: preserve signed zero, inf, nan for exports Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39187>	2026-01-09 11:58:52 +00:00
Georg Lehmann	84ecac58a6	ac/nir/opt_pack_half: preserve fp_math_ctrl Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39187>	2026-01-09 11:58:52 +00:00
Georg Lehmann	5241343ccb	ac/nir/lower_sin_cos: preserve fp_math_ctrl Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39187>	2026-01-09 11:58:52 +00:00
Georg Lehmann	9331726157	ac/nir/lower_sin_cos: use nir_shader_alu_pass Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39187>	2026-01-09 11:58:52 +00:00
Samuel Pitoiset	4fa20bacac	radv/ci: document a regression with transfer queue on RENOIR Weird that only RENOIR fails given that ASTC/ETC2 aren't natively supported too. Needs to be investigated but SDMA supports these formats to some extent it seems. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39230>	2026-01-09 10:47:31 +00:00
Samuel Pitoiset	edb730f647	radv: fix flushing gang semaphore with SDMA/ACE If the main CS is SDMA and the gang CS is ACE, this would emit a SDMA_FENCE packet on ACE which just hangs. Fixes: `b1938901d0` ("radv: Use SDMA fence packet when flushing gang semaphores") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39211>	2026-01-09 09:07:45 +00:00
Natalie Vock	60dd9d797e	aco: Swizzle ray launch IDs in the RT prolog This converts from 1D workgroups to 2D ray launch IDs entirely via shader ALU, including handling partial/cut-off workgroups optimally. Doing this entirely in-shader means it Just Works(TM) with indirect dispatches as well. Previous approaches manipulating various things on CPU depending on the dispatch size couldn't handle indirect dispatches. The swizzle implemented here also swizzles with a recursive Z-order pattern, which should be a little more optimal than arranging invocations linearly within the wave. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39142>	2026-01-08 19:49:55 +01:00
Natalie Vock	1f6ac3fa93	radv/rt,aco: Always dispatch 1D workgroups for RT We will swizzle the workgroups ourselves in the next commit. Removes the need for 1D dispatch workarounds. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39142>	2026-01-08 19:49:54 +01:00
Natalie Vock	8baa95e4aa	radv/rt: Use subgroup invocation for stack index Workgroup == subgroup anyway, and we don't have the workgroup thread IDs in RT shaders. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39142>	2026-01-08 19:49:45 +01:00
Georg Lehmann	330e88abb8	amd/drm-shim: add vega20 Vega20 ISA is different enough from Vega10 that having it in drm-shim is useful for testing compiler changes. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39188>	2026-01-08 09:30:54 +00:00
Pierre-Eric Pelloux-Prayer	bfa8dcf3b3	ac/sdma: fix src/dst pitch for sdma < 4 Fixes DRM_PRIME with AMD_DEBUG=notiling. Fixes: `5f8fa6ae03` ("ac,radv,radeonsi: add ac_emit_sdma_copy_linear_sub_window()") Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39019>	2026-01-08 09:45:58 +01:00
Pierre-Eric Pelloux-Prayer	2f347b5725	ac/sdma: fix ac_sdma_get_tiled_header_dword for older gen The header should be 0 for older sdma as well. This fixes DRI_PRIME support for radeonsi. Fixes: `f5ecc5ffd5` ("ac,radv,radeonsi: add ac_emit_sdma_copy_tiled_sub_window()") Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39019>	2026-01-08 09:45:48 +01:00
Georg Lehmann	a706769a0b	nir: move exact bit to nir_fp_math_control Unifies nir per instruction float control. In the future this can be split into contract/reassoc/transform like SPIR-V. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (except SPIR-V) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39103>	2026-01-07 09:40:57 +00:00
Georg Lehmann	eb4737a1dd	nir: add nir_alu_instr_is_exact helper Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39103>	2026-01-07 09:40:57 +00:00
Marek Olšák	492a176cbb	util: increase SHA1_DIGEST_LENGTH to 32 (BLAKE3_KEY_LEN) The last 12 bytes are always 0 for now. With this, all SHA1 functions can be internally implemented as BLAKE3, so that we can switch everything to BLAKE3 by only changing the implementation of the sha1 utility. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39110>	2026-01-07 08:32:33 +00:00
Marek Olšák	1912a00a91	ALL: use SHA1_DIGEST_LENGTH etc. instead of hardcoding the numbers only build_id is switched to use literal 20 instead of SHA1_DIGEST_LENGTH because we will increase SHA1_DIGEST_LENGTH to BLAKE3_KEY_LEN Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39110>	2026-01-07 08:32:33 +00:00
Samuel Pitoiset	9f5dd888b6	radv/sqtt: add a comment about the allocation strategy of the SQTT BO Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39172>	2026-01-07 06:57:29 +00:00
Samuel Pitoiset	ffa343ed05	Revert "radv: allocate the SQTT BO in GTT for faster readback" This reverts commit `da07f1ef3f`. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14591 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39172>	2026-01-07 06:57:29 +00:00
Marek Olšák	13cfd0176c	ac/gpu_info: add #define AMD_MEMCHANNEL_INTERLEAVE_BYTES radeon_info::pipe_interleave_bytes is renamed to r600_pipe_interleave_bytes where it can be 512 on some chips. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39120>	2026-01-06 20:32:10 +00:00
Marek Olšák	92133bb0ab	amd: demystify various optimizations we already have for memory channels Explain why we do what we do, and use the radeon_info field properly. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39120>	2026-01-06 20:32:10 +00:00
Samuel Pitoiset	4fce09268a	ac/perfcounter: rename ac_pc_block::num_global_instances to num_instances Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39155>	2026-01-06 11:43:21 +00:00
Samuel Pitoiset	59dc20262c	ac/perfcounter: rename ac_pc_block::num_instances to num_scoped_instances Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39155>	2026-01-06 11:43:21 +00:00
Samuel Pitoiset	3658d9588f	ac/perfcounter: add num_{16,32}bit_spm_counters to GPU blocks Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39155>	2026-01-06 11:43:21 +00:00
Samuel Pitoiset	9a1b925400	ac/spm: use GPU block distribution mode to determine instances Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39155>	2026-01-06 11:43:20 +00:00
Samuel Pitoiset	ac0423eb3f	ac/spm: use GPU block distribution mode to determine broadcasting SPM is only implemented for GFX10+. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39155>	2026-01-06 11:43:19 +00:00
Rhys Perry	614437ead5	ac/nir: don't vectorize 16-bit shared loads to 8-bit Fixes an issue where a 2x16 load was vectorized with a 1x16 load into a 8x8 load. This became possible after `49d923078f` increased aligned_new_size from 6 bytes to 8 bytes. fossil-db (navi31): Totals from 5 (0.01% of 79825) affected shaders: Instrs: 6994 -> 6257 (-10.54%) CodeSize: 44000 -> 39464 (-10.31%) Latency: 90482 -> 89795 (-0.76%) InvThroughput: 202955 -> 201926 (-0.51%) VClause: 560 -> 565 (+0.89%) Copies: 1135 -> 1108 (-2.38%); split: -2.82%, +0.44% PreVGPRs: 201 -> 199 (-1.00%) VALU: 3882 -> 3201 (-17.54%) SALU: 493 -> 479 (-2.84%) VOPD: 262 -> 258 (-1.53%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `49d923078f` ("ac/nir: fix calculation of aligned_new_size") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14500 Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39156>	2026-01-06 10:28:02 +00:00
Marek Olšák	3552028e87	ac/lower_ngg_mesh: fix a segfault accessing out_variables out of bounds component_addr_off (in bytes) was used to offset a component index (in dwords). Cc: mesa-stable Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39144>	2026-01-05 22:23:42 +00:00
Marek Olšák	b340588119	ac/gpu_info: don't read uninitialized dev_filename This fixes radeonsi-run-tests.py not being able to read AMD_DEBUG=info. Fixes: `8777894d3e` - amd: remove radeon_info::dev_filename Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39145>	2026-01-05 21:56:36 +00:00
Konstantin Seurer	405c93c665	radv: Optimize BVH4 acceleration structure updates It is more efficient to compute the child index of the current node inside the parent node and write the bounds when available. The previous code could load up to 16 AABBs to compute the new ones. The new code also only needs 1/7 of the previously used scratch memory. The new code seems to be around 30% faster (0.5ms) in GOTG on a 6700XT. Reviewed-by: Natalie Vock <natalie.vock@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39142>	2026-01-05 15:24:54 +00:00
Daniel Schürmann	2d0d5fc104	aco/validate: validate constant bus limit after register allocation based on PhysReg Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39107>	2026-01-05 14:54:00 +00:00
Daniel Schürmann	eb16f701a6	aco/tests: Add new test to pack 2x16 SGPRs into VGPR Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39107>	2026-01-05 14:54:00 +00:00
Daniel Schürmann	61c1ec541d	aco/tests: Add test for subdword extraction from SGPR Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39107>	2026-01-05 14:54:00 +00:00
Daniel Schürmann	0674c9d30e	aco/validate: Validate correct RegisterClasses after lowering to HW instructions Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39107>	2026-01-05 14:53:59 +00:00
Daniel Schürmann	b087bf2fbf	aco/lower_to_hw: Fix SGPR Operand RegClasses for pack_2x16 Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39107>	2026-01-05 14:53:59 +00:00
Daniel Schürmann	9f5996ae8a	aco/lower_to_hw: Don't use 2 SGPR operands before GFX10 in a single VOP3 instruction in do_pack_2x16() Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39107>	2026-01-05 14:53:58 +00:00
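
Commit 60dd9d797e describes swizzling ray launch IDs "with a recursive Z-order pattern" so that a 1D invocation index maps to a 2D position. The following is only a minimal sketch of that general idea (a Morton decode) in plain C, not the ACO implementation: the function names are hypothetical, and it does not attempt the handling of partial/cut-off workgroups that the commit mentions.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the even-indexed bits of v and pack them into the low 16 bits. */
static uint32_t compact_even_bits(uint32_t v)
{
   v &= 0x55555555u;
   v = (v | (v >> 1)) & 0x33333333u;
   v = (v | (v >> 2)) & 0x0f0f0f0fu;
   v = (v | (v >> 4)) & 0x00ff00ffu;
   v = (v | (v >> 8)) & 0x0000ffffu;
   return v;
}

/* Hypothetical helper: map a linear index to a 2D position on a Z-order
 * (Morton) curve. Even bits of the index form x, odd bits form y. */
static void zorder_decode_2d(uint32_t index, uint32_t *x, uint32_t *y)
{
   assert(x && y);
   *x = compact_even_bits(index);
   *y = compact_even_bits(index >> 1);
}
```

With this mapping, indices 0 to 3 fill a 2x2 tile, indices 0 to 15 fill a 4x4 tile, and so on, so neighbouring invocations in a wave tend to land on neighbouring 2D positions rather than along a single row.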


19643 commits