For legacy (non-NGG) GS to lower outputs to memory stores and add
shader query when required.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20158>
To be shared by NGG GS and legacy GS. Legacy GS need this when
GFX10 which mix use NGG and legacy GS. For example when streamout
is enabled, it uses legacy GS, otherwise uses NGG GS. So legacy
GS also need to update query emulation which is a sum of NGG and
legacy GS results.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20158>
For used by legacy GS to store output to different ring according
to stream id.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20158>
For simple intrinsic which also need other fields to translate
to LLVM like stream_id.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20158>
Based on f129db911b ("radeonsi/gfx11: use a better workaround for the export conflict bug")
which claims better performance.
I couldn't be bothered to do the refactor to check the sample count with
dynamic sample counts, so this is just conservative there.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20273>
The previous check assumed the stack starts at offset=0, which isn't
necessarily true for ray queries.
Note that this didn't cause correctness issues, just made an optimization
not apply. Found when I accidentally made this load-bearing in a
refactor.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20265>
If shared_memory_explicit_layout=true, we would have skipped lowering task
payload variables to explicit types.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19597>
Delete radv_CmdDispatch, radv_CmdSetDeviceMask, and
radv_GetDeviceGroupPeerMemoryFeatures so that the vk_common_*
versions will be used instead. This will avoid repeated code.
Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20218>
SCC is 1-bit, and we can't copy a 32-bit value into it.
Fixes dEQP-VK.spirv_assembly.type.scalar.i32.iequal_tesse with
ACO_DEBUG=noopt.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 9476986e6f ("aco/ra: special-case get_reg_for_create_vector_copy()")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20240>
This option is rarely or even never used but it was broken. While we
are at it, remove radv_ps_epilog_key::wave32 because the wave size
can only be changed globally for PS.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20199>
The use of std::optional simplifies expressions and would be useful for some
upcoming RA tweaks.
C++17 has been available since the merge of rusticl and should be safe to use as
far as packaging is concerned.
A few style choices are:
- Testing for emptiness uses implicit bool conversion.
- Constructing an empty value uses {}.
- Constructing a filled value uses the implicit conversion constructor.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20125>
If points or lines are drawn using the polygon mode, the guardband
should be adjusted for large points/lines.
Cc: 22.3 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20185>
Death stranding does scratch_arr[80-idx]. This doesn't seem to work if we
try to combine the subtraction into the access.
fossil-db (navi21):
Totals from 52 (0.04% of 135636) affected shaders:
Instrs: 78560 -> 79036 (+0.61%)
CodeSize: 427940 -> 431188 (+0.76%)
Latency: 1313809 -> 1318142 (+0.33%)
InvThroughput: 292833 -> 293842 (+0.34%)
VClause: 2361 -> 2555 (+8.22%); split: -0.51%, +8.73%
Copies: 8767 -> 8746 (-0.24%); split: -0.35%, +0.11%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 0e783d687a ("aco: use scratch_* for scratch load/store on GFX9+")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7735
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20117>