Samuel Pitoiset
ee2400acf1
ac/parse_ib: dump PKT3_DISPATCH_{TASKMESH_GFX,TASKMESH_DIRECT_ACE}
...
Useful for inspecting command buffers.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29821 >
2024-06-25 09:20:48 +00:00
Samuel Pitoiset
4db32ac7ef
radv/amdgpu: use the non-IB path for dumping CS with external IBs
...
Only the first CS chunk was dumped, but this allows to dump CS that
are post the DGC execute IB when on compute queue.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29832 >
2024-06-25 07:24:50 +00:00
Rhys Perry
17f2ebe8d2
aco: use 1.5x vgprs for gfx1151 and gfx12
...
From LLVM.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29839 >
2024-06-24 18:39:40 +00:00
Samuel Pitoiset
030d6e6280
radv/amdgpu: allow cs_execute_ib() to pass a VA instead of a BO
...
DGC IBs are considered external IBs because they aren't managed by
the winsys and the BO itself isn't really useful. Passing a VA instead
will help for future work.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29600 >
2024-06-24 08:02:07 +00:00
Samuel Pitoiset
e51ae61a4d
radv: add the DGC preprocess BO to the cmdbuf BO list
...
This wasn't needed in practice because DGC NV is only enabled for
vkd3d-proton and it always uses the global BO list but better to add it
anyways.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29600 >
2024-06-24 08:02:07 +00:00
Collabora's Gfx CI Team
cdf3228f88
Uprev Piglit to fdf3fc09deb6beecdf212e65a16c645112540b59
...
cf8daaf5ba...fdf3fc09de
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29684 >
2024-06-24 07:10:48 +00:00
Samuel Pitoiset
25bf3200e2
radv: remove useless draw_id to radv_emit_userdata_task()
...
It's always 0 for direct draws.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29830 >
2024-06-24 06:38:43 +00:00
Samuel Pitoiset
d2b1d38392
radv: remove useless masking in radv_cs_emit_indirect_mesh_draw_packet()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29830 >
2024-06-24 06:38:43 +00:00
Samuel Pitoiset
b2ff08800e
radv: remove dead mesh shader code for indirect draws
...
This path is never used by mesh shaders.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29830 >
2024-06-24 06:38:43 +00:00
Samuel Pitoiset
d922a0e875
radv: use radv_shader_info::user_data_0 for task shaders
...
To avoid duplicating the base user SGPR.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29830 >
2024-06-24 06:38:43 +00:00
Samuel Pitoiset
334046648b
radv: cleanup getting AC_UD_TASK_RING_ENTRY for mesh shader
...
The last VGT shader is the mesh shader.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29830 >
2024-06-24 06:38:43 +00:00
Konstantin Seurer
ee751a26fc
radv/rra: Enable RADV_RRA_TRACE_COPY_AFTER_BUILD by default
...
RADV_RRA_TRACE_COPY_AFTER_BUILD is more accurate and the memory issues
are fixed now.
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29537 >
2024-06-21 17:47:53 +00:00
Konstantin Seurer
aa1b9d9be5
radv/rra: Rework calculating the ray history size
...
The previous approach was broken when writing empty metadata.
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29537 >
2024-06-21 17:47:53 +00:00
Konstantin Seurer
090ca37352
radv/rra: Reduce the memory requirement of copy_after_build
...
vkd3d-proton always sets the acceleration structure size to be the
whole buffer size. Because of that, allocating read back buffers
for all acceleration structures causes a system with a finite amount
of RAM to OOM.
This is solved by allocating read back buffers on build where the
required size is known.
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29537 >
2024-06-21 17:47:53 +00:00
Konstantin Seurer
c2c555402b
radv/rra: Bump rt_driver_interface_version to 8.0
...
8.0 matches the layout we emit more closely.
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29537 >
2024-06-21 17:47:53 +00:00
Konstantin Seurer
55f1fe9bc3
radv/rra: Fix reporting the isec invocations
...
Copy+paste mistake, we always set the last call to accept.
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29537 >
2024-06-21 17:47:53 +00:00
Konstantin Seurer
97c0f264f0
radv/rra: Fix disabling the ray history
...
There are a bunch of NULL pointer dereferences that went unnoticed
because the feature is enabled by default.
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29537 >
2024-06-21 17:47:53 +00:00
Konstantin Seurer
bd377cfe89
radv/rra: Move some code into handle_accel_struct_write
...
The code is the same for all callers.
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29537 >
2024-06-21 17:47:53 +00:00
Konstantin Seurer
ea69f7bc89
radv/rra: Detect BVHs with back edges
...
Avoid overflowing the stack and fail validation.
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29537 >
2024-06-21 17:47:53 +00:00
Alyssa Rosenzweig
da752ed7c1
treewide: use nir_def_replace sometimes
...
Two Coccinelle patches here. Didn't catch nearly as much as I would've liked but
it's a start.
Coccinelle patch:
@@
expression intr, repl;
@@
-nir_def_rewrite_uses(&intr->def, repl);
-nir_instr_remove(&intr->instr);
+nir_def_replace(&intr->def, repl);
Coccinelle patch:
@@
identifier intr;
expression instr, repl;
@@
nir_intrinsic_instr *intr = nir_instr_as_intrinsic(instr);
...
-nir_def_rewrite_uses(&intr->def, repl);
-nir_instr_remove(instr);
+nir_def_replace(&intr->def, repl);
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com> [broadcom]
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> [lima]
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> [etna]
Reviewed-by: Pavel Ondračka <pavel.ondracka@gmail.com> [r300]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29817 >
2024-06-21 15:36:56 +00:00
Konstantin Seurer
23ee6ca801
radv/meta: Use READ access for dst_access_flush
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29780 >
2024-06-21 12:52:39 +00:00
Konstantin Seurer
14f7b077c8
radv: Remove dead access bits
...
READ access bits are dead as radv_src_access_flush arguments and WRITE
access bits are dead as radv_dst_access_flush arguments.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29780 >
2024-06-21 12:52:39 +00:00
Konstantin Seurer
1c59634445
radv: Clean up pipeline barrier handling
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29780 >
2024-06-21 12:52:39 +00:00
Pierre-Eric Pelloux-Prayer
14974fd097
ac/llvm: implement WA in nir to llvm
...
LLVM implements multiple workarounds for gfx11.
The problem is that they're not applied for shaders built in
parts.
LLVM will be modified to be more conservative and apply the
workaround in more places but in the meantime, add a simpler
implementation in the NIR to LLVM backend: insert a wait at
the end of each shader part.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10785
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29304 >
2024-06-20 13:14:33 +00:00
Rhys Perry
71afacff39
aco/insert_exec_mask: ensure top mask is not a temporary at loop exits
...
This is problematic when the successor of the loop exit is an invert
block. It assumes that the top mask is Operand(bld.lm) and doesn't change
it when entering the else branch.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11348
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29767 >
2024-06-20 12:47:05 +00:00
Georg Lehmann
5c6c8182c8
radv: inline partial push constant loads
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29675 >
2024-06-20 12:09:29 +00:00
Rhys Perry
bdc229231d
aco: remove push constants
...
These are lowered in NIR.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29675 >
2024-06-20 12:09:29 +00:00
Rhys Perry
38d1456931
ac/llvm: remove push constants
...
These are lowered in NIR.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29675 >
2024-06-20 12:09:29 +00:00
Rhys Perry
edbb75ce3a
radv: lower push constants in NIR
...
fossil-db (navi21):
Totals from 879 (1.11% of 79395) affected shaders:
Instrs: 1359371 -> 1360237 (+0.06%); split: -0.02%, +0.08%
CodeSize: 7290856 -> 7294308 (+0.05%); split: -0.01%, +0.06%
SpillSGPRs: 751 -> 800 (+6.52%)
Latency: 21923904 -> 21923983 (+0.00%); split: -0.03%, +0.03%
InvThroughput: 7029748 -> 7029528 (-0.00%); split: -0.03%, +0.03%
VClause: 23595 -> 23610 (+0.06%)
SClause: 31819 -> 32256 (+1.37%); split: -0.07%, +1.44%
Copies: 109175 -> 110089 (+0.84%); split: -0.13%, +0.97%
Branches: 32068 -> 32072 (+0.01%); split: -0.02%, +0.03%
PreSGPRs: 41831 -> 41774 (-0.14%); split: -0.15%, +0.01%
PreVGPRs: 53605 -> 53604 (-0.00%)
VALU: 1020426 -> 1020521 (+0.01%); split: -0.00%, +0.01%
SALU: 135931 -> 136850 (+0.68%); split: -0.08%, +0.76%
SMEM: 51688 -> 51686 (-0.00%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29675 >
2024-06-20 12:09:28 +00:00
Samuel Pitoiset
e8fb4b82e9
radv: fix emitting indirect descriptor sets in the DGC prepare shader
...
NIR_DEBUG=validate_ssa_dominance failed because dgc_cs_emit() weren't
actually in the if.
Fixes: 33a849e004 ("radv: emit indirect sets for indirect compute pipelines with DGC")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29782 >
2024-06-20 06:33:51 +00:00
Dave Airlie
6a464401d5
ac/radv/radeon: move film grain init to common code.
...
Share the film grain code between users.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29747 >
2024-06-19 20:51:53 +00:00
Dave Airlie
57535969cb
ac/radv/radeonsi: move av1 ctx/probs size/filling to common code.
...
All the av1 prob and ctx sizing code can be shared between
radv and radeonsi here.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29747 >
2024-06-19 20:51:52 +00:00
Dave Airlie
f1e27e156b
radv/video: use vcn ip versions for encoder detection.
...
This aligned with radeonsi code.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29747 >
2024-06-19 20:51:52 +00:00
Dave Airlie
b888946f7a
radv/video: fix layered decode h264/5 tests.
...
CTS tests both layered and separate DPB, but radv wasn't handling
layered properly when used with the tier 2 dpb handling.
This adjusts the addresses to use the layer index for tier2.
Fixes dEQP-VK.video.decode.*layered*
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29758 >
2024-06-19 08:02:31 +00:00
Rhys Perry
9fe3af1e2a
aco: insert s_nop before discard early exit sendmsg(dealloc_vgpr)
...
Forgot about this one.
fossil-db (gfx1100):
Totals from 3920 (2.94% of 133461) affected shaders:
Instrs: 6632088 -> 6636008 (+0.06%)
CodeSize: 34165376 -> 34181056 (+0.05%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Fixes: 37fbfa655a ("aco: insert s_nop before VGPR deallocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29770 >
2024-06-18 20:17:38 +00:00
Georg Lehmann
c3c398d56d
aco: make local functions static in files without anonymous namespace
...
I don't think adding an anonymous namespace in these files is worth it
given the amount of global functions
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29740 >
2024-06-18 17:53:07 +00:00
Georg Lehmann
046414e061
aco: add more anonymous namespaces
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29740 >
2024-06-18 17:53:07 +00:00
Samuel Pitoiset
33a849e004
radv: emit indirect sets for indirect compute pipelines with DGC
...
This used to work by luck because the current DGC prepare shader
is using one descriptor set and it was the currently bound compute
shader... Using two descriptor sets or starting from 1 would just fail.
For indirect compute pipelines, descriptors must be emitted from the
DGC shader because there is no bound compute pipeline at all. This
solution is using indirect descriptor sets because it's much shorter
and easier to implement. This could be improved but nothing uses
indirect compute pipelines and this is like experimental stuff.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29700 >
2024-06-18 13:50:16 +00:00
Samuel Pitoiset
b1ba02e707
radv: force using indirect descriptor sets for indirect compute pipelines
...
Emitting descriptors in DGC is a huge pain but using indirect descriptor
sets is much easier.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29700 >
2024-06-18 13:50:16 +00:00
Timur Kristóf
0bf10ad4ad
radv: Use number of TES inputs for TCS-TES linking.
...
This is to match what ac_nir_lower_tess_io_to_mem also does.
Doesn't address any known bug, but it's theoretically possible
that TCS outputs_written and TES inputs_read mismatch, so let's
be on the safe side here.
Fixes: be49b02f05
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29696 >
2024-06-18 12:06:22 +00:00
Timur Kristóf
0355364743
ac/nir/tess: Fix per-patch output VRAM mapping.
...
VARYING_SLOT_PATCH0 is greater than 64 so it is wrong to use it
with BITFIELD64_BIT. Check for VARYING_SLOT_TESS_LEVEL_* properly
when mapping output locations in VRAM.
Fixes: 2cf7f282df
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11253
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29696 >
2024-06-18 12:06:21 +00:00
Timur Kristóf
0f0ebd8512
ac/nir/tess: Fix per-patch output LDS mapping.
...
VARYING_SLOT_PATCH0 is greater than 64 so it is wrong to use it
with BITFIELD64_BIT. Check for VARYING_SLOT_TESS_LEVEL_* properly
when mapping output locations in LDS.
Fixes: c61eb54806
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29696 >
2024-06-18 12:06:21 +00:00
Timur Kristóf
348b8859dc
ac/nir/tess: Only write tess factors that the TES reads.
...
Otherwise we would write to a memory location reserved
for another per-patch output.
Fixes: 2cf7f282df
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11324
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29696 >
2024-06-18 12:06:21 +00:00
Alyssa Rosenzweig
15257b65c6
treewide: use nir_metadata_control_flow
...
Via Coccinelle patch:
@@
@@
-nir_metadata_block_index | nir_metadata_dominance
+nir_metadata_control_flow
...plus some manual fixups for call sites missed by coccinelle.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Juan A. Suarez Romero <jasuarez@igalia.com> [broadcom]
Acked-by: Vasily Khoruzhick <anarsoul@gmail.com> [lima]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29745 >
2024-06-17 16:28:14 -04:00
Daniel Schürmann
7af16e9f1e
nir/shader_info: remove uses_demote
...
This flag is mostly redundant with uses_discard and was only
introduced to implement demote with LLVM when it didn't have
that intrinsic.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617 >
2024-06-17 19:37:16 +00:00
Daniel Schürmann
9b1a748b5e
nir: remove nir_intrinsic_discard
...
The semantics of discard differ between GLSL and HLSL and
their various implementations. Subsequently, numerous application
bugs occurred and SPV_EXT_demote_to_helper_invocation was written
in order to clarify the behavior. In NIR, we now have 3 different
intrinsics for 2 things, and while demote and terminate have clear
semantics, discard still doesn't and can mean either of the two.
This patch entirely removes nir_intrinsic_discard and
nir_intrinsic_discard_if and replaces all occurences either with
nir_intrinsic_terminate{_if} or nir_intrinsic_demote{_if} in the
case that the NIR option 'discard_is_demote' is being set.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617 >
2024-06-17 19:37:16 +00:00
Daniel Schürmann
f3d8bd18dd
nir: introduce discard_is_demote compiler option
...
This new option indicates that the driver emits the same
code for nir_intrinsic_discard and nir_intrinsic_demote.
Otherwise, it is assumed that discard is implemented as
terminate.
spirv_to_nir uses this option in order to directly emit
nir_demote in case of OpKill.
RADV GFX11:
Totals from 3965 (4.99% of 79439) affected shaders:
MaxWaves: 119418 -> 119424 (+0.01%); split: +0.03%, -0.03%
Instrs: 1608753 -> 1620830 (+0.75%); split: -0.18%, +0.93%
CodeSize: 8759152 -> 8785152 (+0.30%); split: -0.18%, +0.48%
VGPRs: 152292 -> 149232 (-2.01%); split: -2.37%, +0.36%
Latency: 9162314 -> 10033923 (+9.51%); split: -0.46%, +9.97%
InvThroughput: 1491656 -> 1493408 (+0.12%); split: -0.10%, +0.22%
VClause: 21424 -> 21452 (+0.13%); split: -0.31%, +0.44%
SClause: 53598 -> 55871 (+4.24%); split: -2.15%, +6.39%
Copies: 90553 -> 90462 (-0.10%); split: -2.91%, +2.81%
Branches: 16283 -> 16311 (+0.17%)
PreSGPRs: 113993 -> 113254 (-0.65%); split: -1.84%, +1.19%
PreVGPRs: 110951 -> 108914 (-1.84%); split: -2.08%, +0.24%
VALU: 963192 -> 963167 (-0.00%); split: -0.01%, +0.01%
SALU: 87926 -> 90795 (+3.26%); split: -2.92%, +6.18%
VMEM: 25937 -> 25936 (-0.00%)
SMEM: 110012 -> 109799 (-0.19%); split: -0.20%, +0.01%
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617 >
2024-06-17 19:37:15 +00:00
Daniel Schürmann
d5821bdf7d
radv: emit discard as demote by default
...
Also removes radv_lower_discard_to_demote debug option.
Totals from 1506 (1.90% of 79439) affected shaders: (GFX11)
MaxWaves: 46432 -> 46448 (+0.03%)
Instrs: 664515 -> 667914 (+0.51%); split: -0.15%, +0.67%
CodeSize: 3569656 -> 3583440 (+0.39%); split: -0.12%, +0.51%
VGPRs: 50100 -> 49680 (-0.84%); split: -0.96%, +0.12%
Latency: 4221359 -> 4217875 (-0.08%); split: -0.67%, +0.59%
InvThroughput: 628809 -> 625565 (-0.52%); split: -0.53%, +0.02%
VClause: 9948 -> 9965 (+0.17%); split: -0.36%, +0.53%
SClause: 19656 -> 19695 (+0.20%); split: -0.77%, +0.97%
Copies: 32113 -> 33513 (+4.36%); split: -1.59%, +5.95%
Branches: 8406 -> 8378 (-0.33%)
PreSGPRs: 42328 -> 42555 (+0.54%); split: -0.39%, +0.93%
PreVGPRs: 38451 -> 38203 (-0.64%); split: -0.78%, +0.14%
VALU: 390770 -> 390208 (-0.14%); split: -0.16%, +0.02%
SALU: 43318 -> 46374 (+7.05%); split: -0.08%, +7.14%
VMEM: 15052 -> 15051 (-0.01%)
SMEM: 37225 -> 37215 (-0.03%); split: -0.03%, +0.01%
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617 >
2024-06-17 19:37:15 +00:00
Samuel Pitoiset
bd59478d2f
radv: implement streamout on GFX12
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29676 >
2024-06-17 14:46:36 +00:00
Samuel Pitoiset
aa9dfcad50
radv/nir: lower nir_intrinsic_load_xfb_state_address_gfx12_amd
...
This intrinsic returns a 64-bit address that points to the streamout
state buffer.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29676 >
2024-06-17 14:46:36 +00:00