In schedule live value tracking, differentiate between half vs full
precision. Half-precision live values are less costly than full
precision.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3989>
Current scheduler thresholds try to ensure there are warps available to
switch to when hiding texture fetch latency. But if there is none to
hide, we should allow scheduler to use more registers to reduce nops.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3989>
To avoid spurious sync flags, we want to, for a6xx+, operate in terms of
half-regs, with a full precision register testing the corresponding two
half-regs that it conflicts with.
And while we are at it, stop open-coding BITSET
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3989>
Newer a6xx no longer has programmable GMEM base, so we can't rely on the
kernel driver setting it to 0x100000 (GMEM base is 0 on such GPUs).
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3979>
Register packing macros makes this only set the first bit. Set to whole
dword to fix srgb for color attachments >0.
Fixes: 59f29fc8 ("turnip: Convert the rest of tu_cmd_buffer.c over to the new pack macros.")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3979>
I had some trouble because I assumed this was right, tested that the
alignment requirement is actually 16.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3979>
These formats are an exception that can't be modeled in the current format
table. Switch to a table with only a single a6xx_format per vk format,
and deal with the exceptions separately (currently the only exception is
10_10_10_2_UNORM which has a different color format).
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3806>
This hooks tessellation into various parts of draw, so the
tessellation shaders are used in the correct places as the
last shader of the pipeline.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3841>
This is the bulk of the llvm shader builders and tessellation
execution code.
TCS uses a coroutine launcher like compute shaders to handle
barriers. It executes 4-wide with one input vertex per lane.
Tessellation happens before the TES is run.
TES is just a 4-wide launcher, one per primitive is executed,
with one lane per tessellation coordinate input.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3841>
This adds the initial draw_tess.h with a define needed
for the interfaces. TCS input array doesn't need to handle
patch inputs so can be smaller.
The TCS context has some dummy values to align the textures/images
properly.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3841>
This add support for the tessellation i/o callbacks.
Tessellation requires another level of indirect indexing,
and allows fetches from shader outputs.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3841>
This adds the same tessellator code that swr uses, swr should
move to using this copy, unfortunately that wasn't trivial on my first
look.
The p_tessellator wrapper wraps it in a form that is a useful interface
for draw.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3841>
This is pretty basic (and a bit crappy at the moment). I think we
might want some sort of overlay in the future and also be able to
trigger captures with F12 or whatever.
To record a capture, set RADV_THREAD_TRACE to something greater than
zero (eg. RADV_THREAD_TRACE=100 will capture frame #100). If the
driver didn't crash (or the GPU didn't hang), the capture file
should be stored in /tmp.
To open that capture, use Radeon GPU Profiler and enjoy your
profiling times with RADV! \o/
Note that thread trace support is quite experimental, only GFX9 is
supported at the moment, and a bunch of useful stuff are still missing
(shader ISA, pipelines info, etc). More is comming soon.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3900>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3900>
SQTT is also a file format (.rgp extension) that can be consumed
by Radeon GPU Profiler for profiling purposes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3900>
Thread trace markers (also called events in Radeon GPU Profiler)
should be emitted after every draw/dispatch calls to collect data.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3900>
SQTT is a hardware block that collects thread trace data (like
wave occupancy, timings, etc) for every draw/dispatch calls.
It's only supported on GFX9 at the moment but I will add other
generations support soon.
This is the first step towards profiling with RADV!
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3900>
I want to make sure that I don't introduce warnings in turnip where we
have active work going on, and I also want to make sure that the drivers
we care about testing are warnings-clean.
As with the previous -Werror change, this is for CI only and doesn't
affect end-user builds.
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3607>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3607>
The previous instance of this comparision was 1u to avoid the warning, fix
this one too.
Fixes: dba71de5c6 ("aco: only create parallelcopy to restore exec at loop exit if needed")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3607>