This is pretty basic (and a bit crappy at the moment). I think we
might want some sort of overlay in the future and also be able to
trigger captures with F12 or whatever.
To record a capture, set RADV_THREAD_TRACE to something greater than
zero (eg. RADV_THREAD_TRACE=100 will capture frame #100). If the
driver didn't crash (or the GPU didn't hang), the capture file
should be stored in /tmp.
To open that capture, use Radeon GPU Profiler and enjoy your
profiling times with RADV! \o/
Note that thread trace support is quite experimental, only GFX9 is
supported at the moment, and a bunch of useful stuff are still missing
(shader ISA, pipelines info, etc). More is comming soon.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3900>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3900>
SQTT is also a file format (.rgp extension) that can be consumed
by Radeon GPU Profiler for profiling purposes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3900>
Thread trace markers (also called events in Radeon GPU Profiler)
should be emitted after every draw/dispatch calls to collect data.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3900>
SQTT is a hardware block that collects thread trace data (like
wave occupancy, timings, etc) for every draw/dispatch calls.
It's only supported on GFX9 at the moment but I will add other
generations support soon.
This is the first step towards profiling with RADV!
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3900>
The previous instance of this comparision was 1u to avoid the warning, fix
this one too.
Fixes: dba71de5c6 ("aco: only create parallelcopy to restore exec at loop exit if needed")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3607>
gfx9.surf_pitch is supposed to be in blocks (or elements) but addrlib
returns a pitch in pixels.
This cause a mismatch between surface->bpe and surface.u.gfx9.surf_pitch.
For subsampled formats like uyvy (bpe is 2) this breaks in various places:
- sdma copy
- video rendering (see issue https://gitlab.freedesktop.org/mesa/mesa/issues/2363)
when the vl_compositor_gfx_render method is used
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3738>
16-bit med3 is only supported on GFX9+.
Fixes dEQP-VK.spirv_assembly.instruction.amd_trinary_minmax.mid3.f16.*.
Fixes: d6a07732c9 ("ac: use llvm.amdgcn.fmed3 intrinsic for nir_op_fmed3")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3962>
Lower 64-bit fmed3 because LLVM doesn't expose an intrinsic.
Fixes dEQP-VK.spirv_assembly.instruction.amd_trinary_minmax.mid3.f64.*.
Fixes: d6a07732c9 ("ac: use llvm.amdgcn.fmed3 intrinsic for nir_op_fmed3")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3962>
Both RadeonSI and RADV use the WGP mode, so we can assume 128KB on
GFX10.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3899>
It was disabled because untested, but CTS is happy with it.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3808>
sisched is completely unmaintained, it used to give few more FPS
in the past but with ACO, it's now obsolete. It seems even faster
without sisched now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3789>
The hardware supports wide lines and the granularity is way larger.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2982>
Seems required also on GFX8-GFX9 to achieve correct behaviour. This
is an undocumented behaviour but it makes real sense to me.
pipeline-db on GFX9:
Totals from affected shaders:
SGPRS: 1018 -> 1018 (0.00 %)
VGPRS: 516 -> 516 (0.00 %)
Code Size: 40516 -> 40636 (0.30 %) bytes
Max Waves: 280 -> 280 (0.00 %)
This fixes some sort of sun flickering with Assassins Creed Origins.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2488
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3750>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3750>