fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 07:08:05 +02:00

Author	SHA1	Message	Date
Marek Olšák	c1237256cb	ac/nir/tess: execute the tess level workgroup vote on all chips It will be used to skip stores for discarded patches. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>	2025-06-07 16:29:39 +00:00
Marek Olšák	9c16228359	ac/nir/tess: write TCS per-vertex outputs to memory as vec4 stores at the end This improves write throughput for TCS outputs. It follows the same idea as attribute stores in hw GS. The improvement is easily measurable with a microbenchmark. It also has the advantage that multiple output stores to the same address don't result in multiple memory stores. Each output component gets only one memory store at the end of the shader. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>	2025-06-07 16:29:39 +00:00
Marek Olšák	509f0e62ad	ac/nir/tess: allow passing explicit patch_offset to VMEM/LDS offset calculations Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>	2025-06-07 16:29:39 +00:00
Marek Olšák	a59464b6e3	radv,radeonsi: precompute and pass TCS per-vertex output stride via a user SGPR It's a stride of 1 output, which isn't 16. It's 16 * num_threads, aligned to 256. tcs_offchip_layout has 5 unused bits, so let's use them. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>	2025-06-07 16:29:39 +00:00
Marek Olšák	742227c65c	radv,radeonsi: make TCS_OFFCHIP_LAYOUT_NUM_PATCHES not off by one We never use 128 anyway. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>	2025-06-07 16:29:39 +00:00
Marek Olšák	8d3e3c72e0	radv,radeonsi: merge PATCH_CONTROL_POINT & OUT_PATCH_CP into 1 field One is only used by TCS, the other is only used by TES. Use the same field for both, call it PATCH_VERTICES_IN. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>	2025-06-07 16:29:39 +00:00
Marek Olšák	534b282573	ac/nir/tess: adjust memory layout of TCS outputs to have aligned store offsets There is a comment that explains it. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>	2025-06-07 16:29:38 +00:00
Marek Olšák	80236f2367	ac/nir/tess: add if/endif for HS threads in NIR instead of ACO/LLVM This just removes the if/endif wrapping for LLVM, and hopefully the ACO change does the same thing. ACO had redundant code in endif_merged_wave_info, which is removed here. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>	2025-06-07 16:29:38 +00:00
Marek Olšák	cd366b57d9	ac/nir: implement load_subgroup_id/local_invocation_index for TCS on gfx6-10.x Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>	2025-06-07 16:29:38 +00:00
Rhys Perry	86ccceb4de	aco: don't consider gfx1153 to have point sample acceleration Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>	2025-06-06 11:55:13 +01:00
Rhys Perry	f10b49781d	aco: make all wait entries linear If we remove exec skips, then we can wait for an entry on all paths in the linear cfg, but not the logical cfg. fossil-db (gfx1201): Totals from 0 (0.00% of 79653) affected shaders: fossil-db (navi31): Totals from 0 (0.00% of 79653) affected shaders: fossil-db (navi21): Totals from 1586 (1.99% of 79653) affected shaders: Instrs: 5118897 -> 5113206 (-0.11%); split: -0.11%, +0.00% CodeSize: 28365852 -> 28343696 (-0.08%); split: -0.08%, +0.00% Latency: 47820341 -> 47799532 (-0.04%); split: -0.09%, +0.05% InvThroughput: 9904391 -> 9908653 (+0.04%); split: -0.02%, +0.06% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>	2025-06-06 11:55:13 +01:00
Rhys Perry	1088ac49db	aco: sometimes join linear wait entries on logical edges fossil-db (gfx1201): Totals from 1303 (1.64% of 79653) affected shaders: Instrs: 6920949 -> 6917692 (-0.05%); split: -0.06%, +0.01% CodeSize: 37112404 -> 37095728 (-0.04%); split: -0.05%, +0.01% Latency: 70471343 -> 70365986 (-0.15%); split: -0.15%, +0.00% InvThroughput: 11515673 -> 11504666 (-0.10%); split: -0.10%, +0.01% fossil-db (navi31): Totals from 1293 (1.62% of 79653) affected shaders: Instrs: 6500186 -> 6496761 (-0.05%); split: -0.06%, +0.01% CodeSize: 34562712 -> 34549236 (-0.04%); split: -0.04%, +0.01% Latency: 68604746 -> 68666532 (+0.09%); split: -0.15%, +0.24% InvThroughput: 11276591 -> 11284914 (+0.07%); split: -0.10%, +0.17% fossil-db (navi21): Totals from 811 (1.02% of 79653) affected shaders: Instrs: 4110953 -> 4108788 (-0.05%); split: -0.05%, +0.00% CodeSize: 22955984 -> 22948064 (-0.03%); split: -0.03%, +0.00% Latency: 35070231 -> 35064448 (-0.02%); split: -0.02%, +0.00% InvThroughput: 6945610 -> 6945053 (-0.01%); split: -0.01%, +0.00% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>	2025-06-06 11:51:08 +01:00
Rhys Perry	c1f8537131	aco: skip waitcnt between two vmem writing different lanes fossil-db (gfx1201): Totals from 1382 (1.74% of 79653) affected shaders: Instrs: 6531704 -> 6523935 (-0.12%); split: -0.12%, +0.00% CodeSize: 34992076 -> 34933568 (-0.17%); split: -0.17%, +0.01% Latency: 70183360 -> 69616066 (-0.81%); split: -0.81%, +0.00% InvThroughput: 11155445 -> 11068667 (-0.78%); split: -0.78%, +0.00% fossil-db (navi31): Totals from 46 (0.06% of 79653) affected shaders: Instrs: 1833768 -> 1833732 (-0.00%) CodeSize: 9468788 -> 9468716 (-0.00%) Latency: 11683092 -> 11667865 (-0.13%) InvThroughput: 2274377 -> 2272872 (-0.07%) fossil-db (navi21): Totals from 0 (0.00% of 79653) affected shaders: Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>	2025-06-06 11:51:08 +01:00
Rhys Perry	9649deb50e	aco: skip waitcnt between two vmem writing different halves fossil-db (gfx1201): Totals from 4 (0.01% of 79653) affected shaders: Instrs: 41374 -> 41380 (+0.01%); split: -0.01%, +0.02% CodeSize: 238912 -> 238924 (+0.01%); split: -0.01%, +0.01% Latency: 706714 -> 706410 (-0.04%) InvThroughput: 352269 -> 352118 (-0.04%) VClause: 803 -> 798 (-0.62%) fossil-db (navi31): Totals from 0 (0.00% of 79653) affected shaders: fossil-db (navi21): Totals from 0 (0.00% of 79653) affected shaders: Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13028 Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>	2025-06-06 11:51:08 +01:00
Rhys Perry	9a38ad3ca7	aco: add wait_entry::logical_events Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>	2025-06-06 11:51:08 +01:00
Rhys Perry	bb99de00f7	aco: add wait_entry::vm_mask Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>	2025-06-06 11:51:08 +01:00
Rhys Perry	b70ecfa588	aco: only join barrier_imm/barrier_events for logical edges fossil-db (gfx1201): Totals from 3 (0.00% of 79653) affected shaders: Instrs: 2904 -> 2893 (-0.38%) CodeSize: 14944 -> 14900 (-0.29%) Latency: 14703 -> 14248 (-3.09%) InvThroughput: 1237 -> 1210 (-2.18%) fossil-db (navi31): Totals from 3 (0.00% of 79653) affected shaders: Instrs: 2742 -> 2731 (-0.40%) CodeSize: 14136 -> 14092 (-0.31%) Latency: 14744 -> 14287 (-3.10%) InvThroughput: 1241 -> 1213 (-2.26%) fossil-db (navi21): Totals from 3 (0.00% of 79653) affected shaders: Instrs: 2326 -> 2315 (-0.47%) CodeSize: 12472 -> 12428 (-0.35%) Latency: 14921 -> 14465 (-3.06%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>	2025-06-06 11:51:08 +01:00
Rhys Perry	62a9b4b976	aco: set vmem_types for args_pending_vmem fossil-db (gfx1201): Totals from 0 (0.00% of 79653) affected shaders: fossil-db (navi31): Totals from 11 (0.01% of 79653) affected shaders: Instrs: 4543 -> 4554 (+0.24%) CodeSize: 23256 -> 23300 (+0.19%) fossil-db (navi21): Totals from 8 (0.01% of 79653) affected shaders: Instrs: 2333 -> 2341 (+0.34%) CodeSize: 12328 -> 12360 (+0.26%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Backport-to: 25.0 Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>	2025-06-06 11:51:08 +01:00
Samuel Pitoiset	babeb975c4	radv,radeonsi: fix emitting UPDATE_DB_SUMMARIZER_TIMEOUT on GFX12 Not all PFP firmwares for GFX12 have this packet. Fixes: `47f5d25f93` ("radv,radeonsi: emit UPDATE_DB_SUMMARIZER_TIMEOUT on GFX12") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13312 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35370>	2025-06-05 16:51:07 +00:00
Rhys Perry	00a2ed60f8	radv/meta: use unsigned min in copy/fill shaders Otherwise, this would break >2 GiB copy/fill. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Backport: 25.1 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35343>	2025-06-05 09:55:32 +00:00
Georg Lehmann	297fdc6636	radv: don't accidentally expose samplerFilterMinmax through Vulkan 1.2 Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35339>	2025-06-05 09:01:19 +00:00
Marek Olšák	c3034fa82c	amd: replace most u_bit_consecutive* with BITFIELD_MASK/RANGE Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35346>	2025-06-04 17:46:38 +00:00
David Rosca	e579b982b0	radv/video: Set all pic params for H264 encode refs Fixes encoding B-frames with I-frame as L1 reference. Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35283>	2025-06-04 11:33:02 +00:00
David Rosca	92e99e6169	radv/video: Add radv_enc_h264/5_pic_type Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35283>	2025-06-04 11:33:02 +00:00
Samuel Pitoiset	098c15bfc9	radv: use paired shader registers for graphics on GFX12 Loosely based on RadeonSI. This is supposed to be faster because parsing the packet header seems to be the main bottleneck on GFX12. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35282>	2025-06-04 09:17:51 +00:00
Samuel Pitoiset	c8b3c92a3e	radv: add macros for paired shader registers on GFX12 Imported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35282>	2025-06-04 09:17:51 +00:00
Samuel Pitoiset	c8f9e0fb05	radv: add a new dirty state for emitting tess user SGPRs Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35282>	2025-06-04 09:17:51 +00:00
Georg Lehmann	c27cdaac70	radv: expose scalarBlockLayout on GFX6 Scalar block layout doesn't allow anything that our memory load/store vectorizer couldn't create on its own. So I assume whatever reason there was to only expose this feature on GFX7+ was incorrect or ended up being fixed. Passes vkcts in CI on tahiti. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35279>	2025-06-04 08:49:57 +00:00
Karol Herbst	4f5ce2d5aa	ac/nir: fix unaligned single component load/stores This fixes two problems: 1. we need to lower the bit_size according to the alignment. 2. num_components could end up being 0, so we need to round up instead. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13102 Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34976>	2025-06-03 13:14:31 +00:00
Samuel Pitoiset	94a4ba5b4d	radv/ci: bump the timeout for radv-polaris10-vkcts Looks like it's actually also affected by the memory explosion caused by zerovram alloc by default in AMDGPU. Though it's very random, sometimes the job will finish in 40 minutes, sometimes it needs more than 1h15m. Let's bump the timeout because it's a post-merge job. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35157>	2025-06-03 10:18:30 +00:00
Rhys Perry	2e82f481ca	radv: fix too large shift exponent in radv_remove_color_exports "shift exponent 1020 is too large for 32-bit type 'unsigned int'" with madmax/25b8180e05220b8c and UBSan Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35255>	2025-06-03 09:45:01 +00:00
Valentine Burley	3a0cc0ee0d	ci: Use zstd compressed kernel modules Change how we package kernel modules: instead of storing them in .tar.zst archives with uncompressed .ko files inside, we now compress each .ko file individually with ZSTD and bundle them into a plain tar archive. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35129>	2025-06-03 07:27:26 +00:00
Georg Lehmann	a6675f35b2	aco: clamp exponent of 16bit ldexp The hw uses only a 16bit int, but NIR's src is 32bit. Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34073>	2025-06-03 06:34:18 +00:00
Natalie Vock	dac6f09451	radv/rt: Report 256 byte alignment for scratch This mirrors AMDVLK. 128-byte alignment is possible, but DOOM: The Dark Ages screws up scratch allocation with alignments <256 bytes. Fixes hangs in DOOM: The Dark Ages. Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35152>	2025-06-02 19:52:51 +00:00
Natalie Vock	6628ac8ad9	radv/rt: Avoid encoding infinities in box node coords On Navi33, certain box sorting modes combined with infinity/-infinity in the child AABBs cause image_bvh64_intersect_ray to return garbage node pointers. To avoid this, convert infinity to the maximum representable floating-point value, which will still intersect with any non-inf ray. Fixes consistent hangs in DOOM: The Dark Ages. Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35254>	2025-06-02 19:33:18 +00:00
Rhys Perry	1fdfdbaf92	aco/hard_clauses: simplify and complete get_type() This now includes image_msaa_load and the new atomic instructions in GFX12. It also treats point sample accelerated MIMG as either sample or load, like the waitcnt insertion pass. I'm not sure if that's necessary or not, though. No fossil-db changes (gfx1201, gfx1150 and navi31). Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35235>	2025-06-02 10:28:10 +00:00
Rhys Perry	8764ec0230	aco: consider image_msaa_load a sample operation before gfx12 LLVM commit 62dea99a7d7df9daedbb86133f3d46699cd2728d made this instruction a sample for all GFX levels, then with f898161bfa95723954a273a519180e070a5ccd2e it was changed to be GFX12+. Now 34b6285735c999d2fab77b0ff8e5b497d86df3af changed it to be all GFX levels again. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35235>	2025-06-02 10:28:09 +00:00
David Rosca	960f63596f	radv/video: Add VCN5 encode support New with VCN5 is separate reference images support. Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35261>	2025-06-02 09:30:30 +00:00
David Rosca	4a3b3febda	radv/video: Enable decode on VCN5 No differences from VCN4 for tier2. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13118 Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35261>	2025-06-02 09:30:30 +00:00
David Rosca	25f7996395	radv/video: Set correct minCodedExtent for encode Cc: mesa-stable Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35261>	2025-06-02 09:30:30 +00:00
David Rosca	ef305f3875	radv: Use RADEON_SURF_VIDEO_REFERENCE for video DPB images Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35261>	2025-06-02 09:30:30 +00:00
Samuel Pitoiset	47f5d25f93	radv,radeonsi: emit UPDATE_DB_SUMMARIZER_TIMEOUT on GFX12 This tries to mitigate the HiZ GPU hang by increasing a timeout. Loosely based on PAL but I can confirm it delays the hang when BOTTOM_OF_PIPE_TS is used as a workaround. This must be emitted when the GFX queue is idle. Cc: mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35212>	2025-06-02 07:30:18 +00:00
David Rosca	8f4e251c98	radeonsi/vcn: Support disabling HEVC dependent slice segments With older FW this needs to be always enabled, but it can now be disabled when using the new separate header instructions for dependent_slice_segment_flag and slice_segment_address. Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35072>	2025-05-30 08:29:53 +00:00
Samuel Pitoiset	9692ef41a3	aco: implement bitfield_extract for 8-bit/16-bit Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35199>	2025-05-29 12:24:59 +00:00
Samuel Pitoiset	fe2c93a788	ac/nir: enable 64-bit lowering for bitfield_extract Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35187>	2025-05-29 08:45:41 +02:00
Samuel Pitoiset	8596150ae8	aco: implement bitfield_reverse for types other than 32-bits Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34583>	2025-05-28 09:52:12 +00:00
Daniel Schürmann	5b4d284493	aco/isel: use vector-aligned operands for image_bvh64_intersect_ray Totals from 93 (0.12% of 79377) affected shaders: (Navi48) MaxWaves: 1376 -> 1368 (-0.58%) Instrs: 3583500 -> 3581861 (-0.05%); split: -0.05%, +0.00% CodeSize: 18792300 -> 18785296 (-0.04%); split: -0.04%, +0.00% VGPRs: 8652 -> 8592 (-0.69%); split: -1.25%, +0.55% Latency: 20861347 -> 20834407 (-0.13%); split: -0.17%, +0.04% InvThroughput: 4032604 -> 4028020 (-0.11%); split: -0.14%, +0.03% VClause: 90507 -> 90525 (+0.02%); split: -0.01%, +0.03% Copies: 279429 -> 277839 (-0.57%); split: -0.58%, +0.01% Branches: 100260 -> 100251 (-0.01%) PreVGPRs: 8949 -> 8771 (-1.99%) VALU: 1955635 -> 1954053 (-0.08%); split: -0.08%, +0.00% SALU: 477347 -> 477329 (-0.00%); split: -0.01%, +0.01% VOPD: 69 -> 61 (-11.59%) Totals from 93 (0.12% of 79377) affected shaders: (Navi31) MaxWaves: 1376 -> 1374 (-0.15%) Instrs: 3442606 -> 3440344 (-0.07%); split: -0.07%, +0.00% CodeSize: 17801008 -> 17790476 (-0.06%); split: -0.07%, +0.01% VGPRs: 8652 -> 8556 (-1.11%); split: -1.25%, +0.14% Latency: 20590943 -> 20542279 (-0.24%); split: -0.27%, +0.03% InvThroughput: 3978133 -> 3969497 (-0.22%); split: -0.25%, +0.03% VClause: 91784 -> 91769 (-0.02%); split: -0.05%, +0.03% Copies: 277177 -> 275263 (-0.69%); split: -0.70%, +0.01% Branches: 100098 -> 100092 (-0.01%); split: -0.02%, +0.01% PreVGPRs: 9021 -> 8843 (-1.97%) VALU: 2001794 -> 1999893 (-0.09%); split: -0.10%, +0.00% SALU: 419504 -> 419559 (+0.01%); split: -0.01%, +0.02% VOPD: 77 -> 64 (-16.88%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34359>	2025-05-28 09:24:17 +00:00
Rhys Perry	c50f9541e4	aco/tests: Add tests for vector-aligned operands Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34359>	2025-05-28 09:24:17 +00:00
Daniel Schürmann	b5382faa9c	aco/validate: validate register assignment of vector-aligned operands Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34359>	2025-05-28 09:24:17 +00:00
Daniel Schürmann	9091c3bf5b	aco/ra: add affinities for MIMG vector-aligned operands Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34359>	2025-05-28 09:24:17 +00:00
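
Commit a59464b6e3 above describes the TCS per-vertex output stride as "16 * num_threads, aligned to 256". A minimal sketch of that arithmetic follows, taking the commit's wording at face value. The function names are hypothetical, the reading of `num_threads` as a plain thread count is an assumption, and this is an illustration in Python rather than the Mesa C code.

```python
def align(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (alignment > 0)."""
    return (value + alignment - 1) // alignment * alignment


def tcs_per_vertex_output_stride(num_threads: int) -> int:
    """Stride in bytes of one output slot, per the commit message:
    16 bytes (one vec4) per thread, rounded up to a multiple of 256.

    Hypothetical helper: the name and the exact meaning of num_threads
    are assumptions, not taken from the Mesa source.
    """
    return align(16 * num_threads, 256)


if __name__ == "__main__":
    for n in (1, 16, 17, 64):
        print(n, tcs_per_vertex_output_stride(n))
```

Under these assumptions, any thread count up to 16 yields a stride of 256 bytes, and 17 threads already need 512, which is why the commit notes that the stride of one output "isn't 16".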

17711 commits