The pass implements two mitigations for the GFX6-7 SMEM bug:
1. To mitigate VM faults by NULL descriptors:
Make sure that SMEM buffer loads always access a mapped BO.
Use either the descriptor BO (or compute scratch BO),
or otherwise use the zero-filled BO in their place.
2. To mitigate VM faults by OOB robust buffer access:
Add an instruction to clamp the offset source to the
num_records field of the descriptor. It will be still
out of bounds, but the VM fault can be completely mitigated
if the driver adds a padding to each memory allocation.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38769>
On GFX6-7, SMEM instructions access memory when num_records == 0
or offset >= num_records, which causes VM faults when reading a
page that isn't mapped.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38769>
Subgroup ops make divergence information useless for our purpose,
we would need workgroup divergence.
The game affected here has control flow dependent on vote_any,
so it's possible that a wave only executes the code after culling/reordering
invocations.
That means we can't reuse the maybe undefined value from before culling.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14459
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39060>
Performance counters are too different between generations and it's
less error prone to define them separately for each generations.
I'm starting with GFX12 first.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39083>
Replace draw_entry_bytes with AC_TASK_DRAW_ENTRY_BYTES.
This is 16 on all AMD HW that supports task/mesh shaders.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39032>
The pass now uses the ring descriptors to figure these out.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39032>
Currently the number of task payload entry size is hardcoded
in shaders as a constant. This isn't a good idea because it
makes the code inflexible, eg. doesn't allow us
to change the number of entries dynamically.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39032>
Currently the number of task shader ring entries is hardcoded
in shaders as a constant. This isn't a good idea because it
makes the code inflexible, eg. prevents us from using the same
shader binary accross some chips as well as doesn't allow us
to change the number of entries dynamically.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39032>
These aren't new in RGP 2.6, they have been added since a while. But
because RADV wasn't supporting the new derived SPM chunk it wasn't
possible to expose them.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
This is the new method to add performance counters to RGP captures.
This will be used to add the new RGP 2.6 counters too.
The previous SPM code will be deprecated at some point but it's hard
to support all generations in one batch. So, I will implement this
step by step.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
Some blocks have two or more SPM counters and they should be used when
more than 4 counters are programmed (ie. 16-bit per counter).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>