Commit graph

17254 commits

Author SHA1 Message Date
Timothy Arceri
d8782db3a4 glsl: fix regression in ubo cloning
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Fixes KHR-GL46.layout_binding.block_layout_binding_block_VertexShader
with radeonsi.

Fixes: 2b2132d2ac ("nir: fix uniform cloning helper")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34337>
2025-04-06 19:43:47 +10:00
Valentine Burley
b411310b12 radv/ci: Update ANGLE version used for traces
The updated ANGLE version fixed the fog rendering in the minetest trace.

The PIGLIT_REPLAY_ANGLE_ARCH was also changed in the new artifact to
match ANGLE's own naming.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7916

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34308>
2025-04-05 09:13:54 +02:00
Valentine Burley
1668feefb4 ci: Make it possible to use ANGLE traces on other architectures
Don't hardcode amd64 architecture, use PIGLIT_REPLAY_ANGLE_ARCH to make
it easier to opt in for ANGLE traces on arm64 in the future.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34308>
2025-04-05 09:13:53 +02:00
Eric Engestrom
e65d0d2250 radv/ci: update expectations
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34386>
2025-04-04 23:49:23 +00:00
Eric Engestrom
6a86683ef8 radeonsi/ci: update expectations
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34386>
2025-04-04 23:49:23 +00:00
Georg Lehmann
c70dcd1451 aco/gfx9+: use d16 global/scratch/buffer loads
Full register loads are not nessecary and prevent packing optimizations.
Global/Scratch is GFX9+ so D16 loads are always supported.
We already used LDS D16 loads.

Foz-DB Navi31(mostly RA noise):
Totals from 716 (0.90% of 79789) affected shaders:
Instrs: 3854176 -> 3854238 (+0.00%); split: -0.00%, +0.00%
CodeSize: 20034440 -> 20035220 (+0.00%); split: -0.00%, +0.00%
Latency: 24410951 -> 24411120 (+0.00%)
InvThroughput: 5181276 -> 5181301 (+0.00%)
Copies: 320258 -> 320317 (+0.02%)
VALU: 2207307 -> 2207366 (+0.00%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34346>
2025-04-04 16:20:39 +00:00
MaciejDziuban
f31a33905a radv: Use vk_video_derive_h265_scaling_list
This commit makes radv use vk_video_derive_h265_scaling_list, which properly
applies default scaling lists whenever they're needed. It also simplifies
update_h265_scaling function into a simple memcpy. The firmware interface
struct and Vulkan's StdVideoH265ScalingLists struct both have identical memory
layouts, so it's not neccessary divide it into multiple copies with offsets.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34096>
2025-04-04 07:23:48 +00:00
Timur Kristóf
e258492a8f radv: Remove radv_streamout_info::num_outputs.
This field was never used for determining the number of outputs,
just for determining whether streamout was enabled, which makes
it unnecessary. We can use enabled_stream_buffers_mask for that.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>
2025-04-03 19:54:51 +00:00
Timur Kristóf
ce2138d73a radv: Call nir_opt_undef too after nir_opt_varyings.
Shaders may have undefined output stores after nir_opt_varyings.
These must be optimized out, otherwise they hit an assertion.

Fixes: 17f6ab28cc
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>
2025-04-03 19:54:51 +00:00
Timur Kristóf
15d0804670 radv: Use buffers_written mask when gathering XFB info.
We need to enable these buffers regardless of whether or not the
shader actually writes any outputs to them, otherwise we break
XFB queries.

Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>
2025-04-03 19:54:51 +00:00
Stéphane Cerveau
ee535aa039 radv: video: rework maxActiveReferenceSlot/MaxDpbSlots
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
For the pReferenceSlots.slotIndex, the max
value should the maxDpbSlots which is
h264: 16 + 1
h265 : 15 + 2
av1: 7+2

Fixing SVA_CL1_E test vector in JVT-AVC_V1
fluster test suite.

Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33094>
2025-04-03 13:20:45 +00:00
Eric Engestrom
6331441e24 ci: rename ci-tron priority tag to avoid conflict with the generic fdo runners
Otherwise, ci-tron runners with that tag could pick up jobs meant for the fdo
runners, as happened here:
https://gitlab.freedesktop.org/mesa/mesa/-/jobs/73883719

The inverse (fdo runners picking up a job meant for a ci-tron runner) is not
possible though, as ci-tron jobs always include a `farm:$RUNNER_FARM_LOCATION`
tag, so the problem only exists in the other direction.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34358>
2025-04-03 11:25:12 +00:00
Samuel Pitoiset
ef3363ef71 radv: rework suspend/resume user conditional rendering
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Better to suspend/resume in the top level function.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34338>
2025-04-03 08:54:36 +00:00
Samuel Pitoiset
4bc971a0bd radv: add new helper to suspend/resume user conditional rendering
Instead of duplicating same code everywhere.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34338>
2025-04-03 08:54:36 +00:00
Samuel Pitoiset
4d1d6d4147 radv: fix ignoring conditional rendering with vkCmdResolveImage()
This command isn't supposed to be affected by conditional rendering.

This fixes new VKCTS coverage
dEQP-VK.conditional_rendering.conditional_ignore.resolve_image*.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34338>
2025-04-03 08:54:36 +00:00
David Rosca
597f13b244 radv: Add radv_format_description to remap 10/12bit formats to 16bit
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Remapping was missing for format description which made these formats
effectively unsupported as zero format features were reported.

Fixes: 0098f8ef35 ("radv: Remap 10 and 12 bit formats to 16 bit formats")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34274>
2025-04-02 08:40:28 +00:00
David Rosca
3ef0ee2241 radv: Use radv_format_to_pipe_format instead of vk_format_to_pipe_format
Fixes: 9af11bf306 ("radv: add initial DCC support on GFX12")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34274>
2025-04-02 08:40:28 +00:00
Samuel Pitoiset
64e6e043b3 Revert "radv: program SAMPLE_MASK_TRACKER_WATERMARK optimally for GFX11 APUs"
This reverts commit 96e9c3fe77.

This actually causes random GPU hangs like on Phoenix.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12461
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12426
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12692
Tested-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34306>
2025-04-02 07:10:40 +00:00
Samuel Pitoiset
fac44c0ca0 ac/surface: fix selecting preferred alignments for HiZ/HiS on GFX12
VK_MESA_image_alignment_control is used by vkd3d-proton to set
optimal alignments for images. Though, the preferred alignment was
only applied to the surface (or the stencil aspect) but not to the HiZ
surface due to the NULL check.

This caused rendering issues because swizzle modes didn't match.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12831
Fixes: 079f55d405 ("radv: advertise VK_MESA_image_alignment_control on GFX12")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34322>
2025-04-02 06:47:59 +00:00
Autumn Ashton
ae6d24c4ef radv: Expose VK_SAMPLE_COUNT_1_BIT for sample position on GFX10+
This works on GFX10+.

Signed-off-by: Autumn Ashton <misyl@froggi.es>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28237>
2025-04-01 21:15:34 +01:00
Autumn Ashton
693e3b47f7 radv: Expose EXT_sample_locations everywhere
This works and passes CTS now!

Signed-off-by: Autumn Ashton <misyl@froggi.es>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28237>
2025-04-01 21:15:31 +01:00
Autumn Ashton
343c434c50 radv: Enable fragmentShadingRateWithCustomSampleLocations
We need to expose this, as we support it.

Otherwise 1x1 is assumed and we fail some CTS.

Signed-off-by: Autumn Ashton <misyl@froggi.es>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28237>
2025-04-01 21:15:28 +01:00
Autumn Ashton
3d75082c02 radv: Fix compute resolve rounding
When we are using compute resolve, we can get
values the CTS does not expect due to the value
we end up writing for UNORM in
`nir_image_deref_store`.

Make the compute resolve rounding path match with
the output of the fragment shader resolve path,
by going through the same FP16 RTZ conversion as
we do for UNORM/SNORM formats.

This is why VK_EXT_sample_locations CTS was
failing on > GFX9.
On <= GFX9, I am assuming we are falling back to
RESOLVE_FRAGMENT, due to DCC stuff, which is why
it works there.

I tested a handful of images from the Vulkan CTS
for the sample locations and resolve tests for
diff UNORM formats from the qpa file forcing
FRAGMENT and with this change.
With this change, we now match on the compute
resolve path the same sha for the ones I compared
with ImageMagick `identify`.

CTS passes for: *resolve*, *image_clearing* and
*sample_locations* on RX 7900XTX.

Signed-off-by: Autumn Ashton <misyl@froggi.es>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28237>
2025-04-01 21:15:24 +01:00
Marek Olšák
ce716d009f ac/nir/cull: cull small prims using a point-triangle intersection test
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This is based on Timur Kristof's code, but there are a lot of differences.
The idea is that it doesn't just compute an intersection between a point
and a triangle. It computes the *distance* between a point and a triangle
and it does so in screen space. It accurately takes the subpixel precision
of the rasterizer into account, so that it works optimally at all
resolutions, all MSAA modes, and all quant modes.

The distance computation is only approximated because it only considers
the infinite lines going through triangle edges. However, it seems to be
more than sufficient in practice because the existing rounding-based small
prim culling compensates for it.

The performance improvement is up to 10% in some geometry-bound tests,
though targeted microbenchmarks can show a lot more than that.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33361>
2025-04-01 16:12:22 +00:00
Collabora's Gfx CI Team
1ce0cef6bf Uprev ANGLE to 1b34d2a18af12cc55a3bc74dd679c2937d10cc5c
6abdc11741...1b34d2a18a

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34277>
2025-04-01 12:51:06 +00:00
Daniel Stone
a9f87ff0bd ci/amd: Disable radv-fossils
This job is currently broken due to the lack of git in the Vulkan
container; it should really be pulling the fossils from S3 like the
traces anyway.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34280>
2025-04-01 12:21:00 +00:00
Marek Olšák
5e02621a8a amd/addrlib: remove the DCC page fault workaround
It doesn't cause page faults anymore.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34099>
2025-04-01 03:23:22 -04:00
Marek Olšák
f0e6d86f4e amd: update addrlib
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34099>
2025-04-01 03:23:22 -04:00
Samuel Pitoiset
902c76b3be radv/ci: remove all skips for STONEY
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Seems fine too.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34301>
2025-04-01 06:58:50 +00:00
Samuel Pitoiset
cb1144145a radv/ci: stop skipping one memory test due to timeouts
It seems fine now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34301>
2025-04-01 06:58:50 +00:00
Samuel Pitoiset
71b49aecdc radv: switch back radeon_cmdbuf to use 32-bit counters
This has been tested again with vkoverhead on 4 different CPUs and
using 32-bit counters is the fastest combination overall.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34229>
2025-04-01 06:18:28 +00:00
Samuel Pitoiset
f0b3a6f9d4 radv: rework command buffer emission with begin/end sequences
A begin/end sequence is something like (it's all macros based):

   radeon_begin(cs);
   radeon_emit(PKT3(PKT3_DRAW_INDEX_AUTO, 1, cmd_buffer->state.predicating));
   radeon_emit(vertex_count);
   radeon_emit(V_0287F0_DI_SRC_SEL_AUTO_INDEX | use_opaque);
   radeon_end();

This is loosely based on RadeonSI (see !8653 (a0978fff)) and it seems
indeed faster overall.

The main goal of this rework is to re-use the same logic as RadeonSI
for paired packets on GFX12 (also GFX11 dGPUs) because it's supposed
to be way faster, especially on GFX12 where the CP is slow. The other
goal is to share more cmdbuf emission between both drivers in the near
future.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34229>
2025-04-01 06:18:28 +00:00
Pierre-Eric Pelloux-Prayer
785df1b980 ac/nir: fix nir_metadata value of ac_nir_lower_image_opcodes
This pass can insert new blocks so 'nir_metadata_control_flow' is not
preserved.

Fixes: eaf98b1422 ("ac/nir: implement image opcode emulation for CDNA, enable it in radeonsi")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34241>
2025-03-31 15:19:29 +02:00
Samuel Pitoiset
97e8872f1c radv: only enable HTILE for depth/stencil attachment images
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
It's really only useful for depth/stencil attachments. vkd3d and DXVK
both always use that usage flag for depth/stencil images.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34231>
2025-03-31 11:55:02 +00:00
Samuel Pitoiset
ba9988d230 radv: remove useless use of radv_image_use_comp_to_single()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34231>
2025-03-31 11:55:02 +00:00
Samuel Pitoiset
5398ec6356 radv: add queue family assertions when doing decompression passes
This is to make sure the previous functions that are supposed to
trigger a decompression pass work as expected.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34231>
2025-03-31 11:55:02 +00:00
Samuel Pitoiset
086f529bbe radv: do not trigger FCE or FMASK decompress on compute queue
A pipeline barrier which contains an image layout transition like
COLOR_ATTACHMENT_OPTIMAL -> TRANSFER_DST_OPTIMAL on compute queue
would just hang. Such a barrier is useless in practice but it's legal.

Prevent GPU hangs by skipping FCE or FMASK_DECOMPRESS when it's not
on the graphics queue.

Fixes dEQP-VK.synchronization2.layout_transition.compute_transition*.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34231>
2025-03-31 11:55:02 +00:00
Georg Lehmann
de45676efd aco/insert_exec: reset exec temporary after combined p_demote + p_end_wqm
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Otherwise the next divergent merge block might re-enable demoted invocations.

Fixes: 90faadae72 ("aco/insert_exec_mask: don't disable dead quads on demote in divergent CF")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12898
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12912
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34278>
2025-03-31 06:43:22 +00:00
David Rosca
f9d7d131a4 ac/parse_ib: Parse VCN DYNAMIC_REFLIST_BUFFER
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34262>
2025-03-29 08:50:49 +00:00
David Rosca
5275a88174 ac/parse_ib: Fix parsing output format on VCN5
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34262>
2025-03-29 08:50:49 +00:00
Timur Kristóf
64c6930bfc ac/nir/ngg: Remove cleanup_culling_shader_after_dce.
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Not needed anymore, now that the new concept is there.

No Fossil DB changes on Navi 21.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22073>
2025-03-29 00:47:20 +00:00
Timur Kristóf
243a80be44 ac/nir/ngg: Use deferred info for compacted arguments.
This means we don't have to emit dead code anymore and can only
repack the sysvals that are actually used by the deferred part.

No Fossil DB changes on Navi 21.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22073>
2025-03-29 00:47:20 +00:00
Timur Kristóf
0b71293358 ac/nir/ngg: Gather info about what the deferred shader part uses.
Now that the deferred shader part is prepared before emitting
the non-deferred part, we can also gather info about what sysvals
it needs.

No Fossil DB changes on Navi 21.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22073>
2025-03-29 00:47:20 +00:00
Timur Kristóf
e4c91c01e3 ac/nir/ngg: Prepare deferred shader part before adding culling code.
The previous concept was to emit the non-deferred shader part
first, including the culling code, and then modify the
non-deferred part accordingly.

This caused some issues because it was really impossible to tell
which sysvals the deferred part needs after DCE, so we had to
run an additional cleanup pass afterwards.

The new concept is to prepare the deferred part first by applying
reusable variables (from the non-deferred part) and run DCE.
This opens the possibility to accurately gather info about what
the deferred part needs.

This idea is further expanded in the next commits.

Fossil DB stats on Navi 21:

Totals from 17 (0.02% of 79377) affected shaders:
Instrs: 18063 -> 18064 (+0.01%)
CodeSize: 93368 -> 93372 (+0.00%)
Latency: 49889 -> 49899 (+0.02%); split: -0.01%, +0.03%
SALU: 2416 -> 2417 (+0.04%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22073>
2025-03-29 00:47:20 +00:00
Timur Kristóf
e9e58fa412 ac/nir/ngg: Remove inputs_needed_by_*
This information will be collected by NIR core better,
no need to do it here. It is also currently unused.

No functional changes.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22073>
2025-03-29 00:47:20 +00:00
Timur Kristóf
1e7d28a82e ac/nir/ngg: Improve reuse of position value.
Instead of hand-rolled code, use nir_scalar and its
helper functions to reuse the position value.
Results in more copies, which are mitigated by
copy prop from the previous commit.

This helps eliminate some instructions, especially VMEM loads
from the deferred shader part of NGG culling shaders, which
can be reused from the position values calculated by the
non-deferred part.

Fossil DB stats on Navi 21:

Totals from 2472 (3.11% of 79377) affected shaders:
MaxWaves: 78748 -> 78772 (+0.03%)
Instrs: 636342 -> 633739 (-0.41%); split: -0.45%, +0.04%
CodeSize: 3444740 -> 3427172 (-0.51%); split: -0.53%, +0.02%
VGPRs: 62552 -> 62176 (-0.60%)
Latency: 2025711 -> 2019449 (-0.31%); split: -0.73%, +0.42%
InvThroughput: 221140 -> 221946 (+0.36%); split: -0.12%, +0.49%
VClause: 5443 -> 5278 (-3.03%); split: -3.20%, +0.17%
SClause: 8369 -> 8302 (-0.80%); split: -0.82%, +0.02%
Copies: 102435 -> 101652 (-0.76%); split: -0.87%, +0.11%
PreSGPRs: 63714 -> 63533 (-0.28%)
PreVGPRs: 48555 -> 48392 (-0.34%)
VALU: 242165 -> 241457 (-0.29%); split: -0.33%, +0.04%
SALU: 197656 -> 197482 (-0.09%); split: -0.10%, +0.01%
VMEM: 7746 -> 7571 (-2.26%)
SMEM: 10822 -> 10730 (-0.85%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22073>
2025-03-29 00:47:20 +00:00
Timur Kristóf
f7a160d501 ac/nir/ngg: Run copy propagation.
Helps eliminate needless copies caused by reusing variables.
Mitigates negative stats from the next commit.

Fossil DB stats on Navi 21:

Totals from 109 (0.14% of 79377) affected shaders:
Instrs: 124480 -> 124486 (+0.00%); split: -0.00%, +0.01%
CodeSize: 651444 -> 651468 (+0.00%); split: -0.00%, +0.00%
Latency: 754120 -> 754116 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 174384 -> 174383 (-0.00%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22073>
2025-03-29 00:47:20 +00:00
Natalie Vock
c1e1d86bd1 radv/rt: Flush CP writes from the common BVH framework with INV_L2 on GFX12
a1b05991 ("radv/rt: Flush L2 after writing internal node offset on GFX12")
did this for radv-internal CP writes - we also need to do this for PLOC
sync data initialization which is done in the common framework.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34178>
2025-03-28 23:07:17 +00:00
Dave Airlie
dc8e21ce60 radv: expose VK_KHR_video_mainteance2
Reviewed-by: Lynne <dev@lynne.ee>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34204>
2025-03-28 21:18:00 +00:00
Dave Airlie
feef12b2a8 radv/video: convert to using common parameter wrappers.
Reviewed-by: Lynne <dev@lynne.ee>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34204>
2025-03-28 21:18:00 +00:00