Samuel Pitoiset
fdc212bd7b
aco: create a new builder variant for ds_add_rtn
...
This instruction can use 1 definition and 3 operands.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19317 >
2022-10-31 13:48:39 +00:00
Bas Nieuwenhuizen
5d04064642
radv: Handle attribute ring intrinsic correctly with LLVM.
...
Again, if we don't set progress to false we get fun stuff.
Fixes: 8bf1aa1b76 ("radv: add lowering for nir_intrinsic_load_ring_attr_{offset}_amd")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19404 >
2022-10-31 13:17:04 +00:00
Bas Nieuwenhuizen
45ff58cfd1
radv: Handle GSVS ring intrinsic correctly with LLVM.
...
If we don't set progress to false we get a mess as a replacement is
still attempted.
Fixes: 382831c986 ("radv,nir: add intrinsics for streamout and GS copy shaders")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19404 >
2022-10-31 13:17:04 +00:00
Bas Nieuwenhuizen
ec9d71498e
radv: Use correct types for loading the rings with LLVM.
...
Ring descriptors are v4i32, not i8.
Fixes: cb117cdc96 ("radv/llvm: use ac_build_gep0_type to get args types")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19404 >
2022-10-31 13:17:04 +00:00
Samuel Pitoiset
d4ec3f21cf
Revert "radv: add a pointer to radv_shader_binary in radv_shader"
...
This is actually not necessary because we compile and upload binaries
directly from libraries with GPL. This introduced random double free
crashes because binaries were potentially freed by concurrent threads.
Root cause found by Ishi.
This reverts commit f8d887527a .
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19383 >
2022-10-31 12:16:38 +00:00
Georg Lehmann
2eac571d61
aco: Use opsel for the third operand.
...
Foz-DB Navi21:
Totals from 2 (0.00% of 134913) affected shaders:
CodeSize: 7788 -> 7772 (-0.21%)
Instrs: 1305 -> 1303 (-0.15%)
Latency: 7175 -> 7163 (-0.17%)
InvThroughput: 2082 -> 2078 (-0.19%)
Copies: 57 -> 55 (-3.51%)
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19380 >
2022-10-31 09:54:01 +00:00
Samuel Pitoiset
25e311e9d3
radv: implement transform feedback queries with NGG streamout
...
The control bit is written to the upper bits because GDS counters
are 32-bits only, this allows to re-use the existing query shader.
Tested on GFX10.3.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19325 >
2022-10-31 08:22:29 +00:00
Bas Nieuwenhuizen
78519987b9
radv: Speculatively tune RT pipelines for GFX11.
...
With ACO not supporting VOPD and the high number of SALU instructions,
we're likely better off using wave64 until we can actually benchmark
this and fix these issues.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19288 >
2022-10-31 02:39:34 +00:00
Bas Nieuwenhuizen
9369b40725
radv: Use PLOC for BVH building
...
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19292 >
2022-10-30 19:48:46 +00:00
Bas Nieuwenhuizen
271865373e
radv: Add PLOC shader
...
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19292 >
2022-10-30 19:48:46 +00:00
Friedrich Vock
14dfb6035f
radv: Add REF as a typename macro to .clang-format
...
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19292 >
2022-10-30 19:48:46 +00:00
Friedrich Vock
0c0f179037
radv: Add global sync utilities
...
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19292 >
2022-10-30 19:48:46 +00:00
Friedrich Vock
608fa1bd25
radv/rt: Track number of inactive leaf nodes
...
To avoid emitting nodes with only invalid children in PLOC.
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19292 >
2022-10-30 19:48:46 +00:00
Friedrich Vock
f502b3aab3
radv/rt: Dispatch internal converter indirectly
...
Preparation for using the converter with PLOC.
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19292 >
2022-10-30 19:48:46 +00:00
Friedrich Vock
49c0995918
radv/rt: Fix internal converter synchronization
...
Fixes: e83e4faf ("radv: Only emit parents from parents that actually end up in the tree.")
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19292 >
2022-10-30 19:48:46 +00:00
Friedrich Vock
fa578f280e
radv: Add radv_indirect_unaligned_dispatch
...
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19292 >
2022-10-30 19:48:46 +00:00
Friedrich Vock
030a1f6843
radv: Use a struct for AABBs
...
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19292 >
2022-10-30 19:48:46 +00:00
Bas Nieuwenhuizen
ccf0a69e05
radv: Make the number of internal nodes be written on the GPU.
...
Opens the door of algorithms with a variable number of nodes.
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19292 >
2022-10-30 19:48:46 +00:00
Bas Nieuwenhuizen
0e23df959e
radv: Add BVH IR header.
...
To include GPU state passed between stages but not in a node.
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19292 >
2022-10-30 19:48:46 +00:00
Friedrich Vock
37525c11d1
radv: Rename emulated float helpers
...
Use only conversion functions now.
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19292 >
2022-10-30 19:48:46 +00:00
Marek Olšák
0ac37b595a
nir: add nir_intrinsic_optimization_barrier_vgpr_amd for LLVM
...
We need this for the MSAA resolve shader.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19243 >
2022-10-29 18:38:33 +00:00
Rhys Perry
7fa50ced14
aco: insert waitcnt before/after ds_ordered_count
...
The LLVM backend does this when lowering ordered_xfb_counter_add_amd. I
guess there is some missing dependency checking or something.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19345 >
2022-10-28 21:50:05 +00:00
Rhys Perry
ea8ddf5c26
aco: add storage_gds
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19345 >
2022-10-28 21:50:05 +00:00
Samuel Pitoiset
eae2867122
radv: move nir_opt_idiv_const/nir_lower_idiv after NGG lowering
...
NGG streamout lowering creates some idiv instructions that need to be
lowered.
No fossil-db results because it's currently broken.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19364 >
2022-10-28 18:19:57 +00:00
Samuel Pitoiset
e2fcbd4a37
radv/llvm: fix dual source blending on GFX11
...
Untested but this should be similar to RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19367 >
2022-10-28 17:03:37 +00:00
Samuel Pitoiset
d172fc1fca
radv: fix VRS limit when attachmentFragmentShadingRate is disabled
...
Can be reproduced on GFX10.3 with RADV_DEBUG=nohiz.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19374 >
2022-10-28 16:30:29 +00:00
Samuel Pitoiset
c41997f29f
radv: fix suspending/resuming pipeline statistics queries with GDS
...
This probably doesn't fix anything in practice because GDS is only
used for the number of generated primitives by GS and meta operations
don't use GS.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19348 >
2022-10-28 08:12:30 +00:00
Samuel Pitoiset
cf687e88ce
ac/nir/ngg: fix emitting streamout output by using packed location
...
In RadeonSI, they are packed but not in RADV, so don't rely on driver
locations.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19365 >
2022-10-28 07:49:32 +00:00
Alyssa Rosenzweig
941c37c085
nir/lower_idiv: Remove imprecise_32bit_lowering
...
NIR has two implementations of lower_idiv, keyed on the
imprecise_32bit_lowering flag. This flag is misleading: the results when
setting this flag "imprecise", they're completely wrong for some values.
If a backend has a native implementation of umul_high, the correct path
isn't that much more expensive. If it doesn't, it's substantially slower
for highp integer divison... but in practice, non-constant highp integer
division is pretty rare.
After a painful migration of the tree, this code path has no more users.
Remove it so nobody else gets the bright idea of using it again.
Closes : #6555
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19303 >
2022-10-27 19:37:14 +00:00
Rhys Perry
93fb84237f
ac/nir: add ac_nir_lower_ngg_options
...
These signatures were getting ridiculous.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19340 >
2022-10-27 13:31:40 +00:00
Rhys Perry
21a319851a
ac/nir: micro-optimize boolean expression
...
Ignoring SCC spilling, the old version is probably faster because this
mixes uniform and divergent booleans.
fossil-db (navi21):
Totals from 61167 (45.10% of 135636) affected shaders:
Instrs: 29961899 -> 29932551 (-0.10%)
CodeSize: 157407028 -> 157289636 (-0.07%)
Latency: 139671953 -> 139625186 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 21221097 -> 21220756 (-0.00%)
SClause: 750438 -> 750439 (+0.00%)
Copies: 2672846 -> 2582332 (-3.39%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19340 >
2022-10-27 13:31:40 +00:00
Daniel Schürmann
c80137fcba
radv/rt: overwrite hit args with undef in case of a miss
...
This helps some variable coalescing.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19188 >
2022-10-27 09:45:39 +00:00
Daniel Schürmann
f4270b7659
radv/rt: create traversal shader independent from main shader
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19188 >
2022-10-27 09:45:39 +00:00
Iago Toral Quiroga
9deef4cde6
vulkan/runtime: include robustness info when hashing a shader stage
...
Suggested by Jason Ekstrand.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18883 >
2022-10-27 08:17:11 +00:00
Qiang Yu
bfb6a5fef1
ac/nir/ngg: add one odd dword to nogs culling pervertex lds
...
radeonsi use like this.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18832 >
2022-10-27 07:35:01 +00:00
Qiang Yu
13fb7f8f2c
ac/nir/ngg,ac/llvm,aco: save nogs ngg culling one lds dword
...
TES rel patch id is <256, so we can use an existing unused LDS
byte instead of extra dword.
To ease the programing, change the index of repacked_arg_vars
for these variables.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18832 >
2022-10-27 07:35:01 +00:00
Qiang Yu
66d1fa9666
ac/nir/ngg: save and restore no_varying/no_sysval_output
...
These are used by radeonsi for param export count, should
be saved and restore.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18832 >
2022-10-27 07:35:01 +00:00
Qiang Yu
b197dd0d15
ac/nir/ngg: allow passthrough with vs primitive id output
...
vertex primtive id and passthrough are not exclusive, just need
to get correct vertex index when passthrough.
radeonsi won't disable passthrough when vs primitive id output,
this is also for fixing the crash of the assertion.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18832 >
2022-10-27 07:35:01 +00:00
Qiang Yu
e536d0fe4b
ac/nir/ngg,radv: move LDS layout calculation out of nir ngg lowering
...
Use lds base load intrinsics in nir ngg lowering to get layout, left
its calulation to driver.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18832 >
2022-10-27 07:35:01 +00:00
Qiang Yu
54eea0e393
ac/nir/ngg: pass primitive_id_location as param for nogs lower
...
radeonsi need to use packed driver location for all outputs,
while radv need to use VARYING_SLOT_*. To meet both drivers'
needs.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18832 >
2022-10-27 07:35:01 +00:00
Qiang Yu
d82b668bc6
ac/nir/ngg: support user edge flags for ngg lower
...
Pack user edge flag into arg code is ported from radeonsi
gfx10_ngg_build_export_prim.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18832 >
2022-10-27 07:35:01 +00:00
Qiang Yu
238eeeacb2
ac/llvm: get back intrinsics used by NGG
...
Will be used by radeonsi.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18832 >
2022-10-27 07:35:01 +00:00
Lionel Landwerlin
53a0804146
radv: tweak lower_shader_calls parameters
...
On Q2RTX shaders :
MaxWaves: 62 -> 69 (+11.29%)
Instrs: 41626 -> 41575 (-0.12%); split: -0.27%, +0.15%
CodeSize: 224960 -> 223740 (-0.54%); split: -0.62%, +0.08%
VGPRs: 800 -> 704 (-12.00%)
Scratch: 75776 -> 70656 (-6.76%)
Latency: 922219 -> 977997 (+6.05%)
InvThroughput: 212154 -> 201746 (-4.91%); split: -5.54%, +0.64%
VClause: 1120 -> 1155 (+3.12%); split: -1.88%, +5.00%
SClause: 1148 -> 1144 (-0.35%); split: -0.70%, +0.35%
Copies: 5840 -> 5788 (-0.89%); split: -0.94%, +0.05%
PreVGPRs: 753 -> 651 (-13.55%)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16556 >
2022-10-26 12:53:26 +00:00
Lionel Landwerlin
1d10d17817
nir/lower_shader_calls: add an option structure for future optimizations
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16556 >
2022-10-26 12:53:25 +00:00
Samuel Pitoiset
db573f7362
aco: add support for device clock on GFX11
...
According to LLVM, s_sendmsg_rtn(GET_REALTIME) should be used instead
of s_memrealtime.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19267 >
2022-10-25 20:23:08 +02:00
Samuel Pitoiset
c481978ac2
aco: split the sendmsg enumeration into sendmsg_rtn
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19267 >
2022-10-25 20:23:07 +02:00
Samuel Pitoiset
6630b6e2aa
aco: add support for s_sendmsg_rtn_b{32,64}
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19267 >
2022-10-25 20:23:05 +02:00
Samuel Pitoiset
3a3df9acda
ac/llvm: add support for device clock on GFX11
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19267 >
2022-10-25 20:22:48 +02:00
Rhys Perry
1c005e72f4
ac/nir: add legacy streamout and GS copy shader helpers
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19302 >
2022-10-25 17:35:08 +00:00
Rhys Perry
382831c986
radv,nir: add intrinsics for streamout and GS copy shaders
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19302 >
2022-10-25 17:35:08 +00:00