Aside from displaying in a tracepoint, it would be useful in order
to decide whether to disable LRZ for the whole renderpass.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32868>
If blend LRZ state was already calculated from static info,
re-calculating it with dynamic state would bring stale values
and therefor result in a wrong calculations.
This resulted in LRZ being disabled when it should have not in
native VK titles.
CC: mesa-stable
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32868>
Currently, when allocating the dst of shared collects, the registers of
its srcs would only be reused if they are killed. This results in a lot
of needless moves due to suboptimal register allocation.
This commit addresses this generically: allow unavailable registers to
be used for a new dst iff it shares a merge set with, and has the same
offset as, the currently live value.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32021>
Detecting non-trivial collects after the fact, i.e., once they are
created by a register assignment of their dst, does not work as this may
cause two different intervals to share a physreg. For example:
_meta:collect sssa_361:150(r50.x) (wrmask=0xf),
sssa_19:12(r50.x), sssa_103:13(r50.y),
sssa_355:102(r51.z), sssa_356:103(r51.w)
This is a non-trivial collect with a partial overlap with one of its
child intervals. After moving its dst to a new interval, it will have
the same physreg as the existing interval for sssa_19, causing all sorts
of trouble for RA.
Prevent this by detecting that a future collect may become non-trivial
at the moment one of its sources gets a register assignment that does
not correspond with it merge set's preferred reg and allocating a new
interval for this component.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: b36a7ce0f1 ("ir3/ra: prevent moving source intervals for shared collects")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32021>
Need to use mfence to strictly order mqd wptr update and ringing doorbell
in cpu. If the compiler or cpu re-orders it, commands will be missed.
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32700>
The userq packets are added using _pkt_begin(), _pkt_add(), _pkt_end()
functions. As of now _pkt_being() and _pkt_add() is called once. It
is not advisible to update wptr value in mqd multiple times. Hence use
next_wptr as cache in the macros and update mqd mptr before job submission
only once.
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32700>
The signal ioctl should only be called after guaranteeing that the hardware
started working on the submissions and that is only after doorbell is ringed.
Otherwise it can in theory happen that the application creates the fence and
is then interrupted before ringing the doorbell. That can result in a GPU
reset because the fence times out.
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32700>
Determine whether the device has hardware raytracing support early, and
then use this result where needed, instead of checking for `gfx_level`
every time.
This is a prerequisite for CYAN_SKILLFISH chip enablement. This chip is
still GFX10, not GFX10_3, but has hardware support for accelerated
`image_bvh{,64}_intersect_ray` instructions. Just checking for `gfx_level`
is insufficient for it.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33109>
This should let us lower the RT query stack to registers instead of
scratch by getting rid of the rest of the members of the ray query
struct. This gives a 24% decrease in total time for 3DMark InVitro.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28447>
Remove the redundant error that will never be hit in practice (because
if fd_dev_name() succeeds then so will fd_dev_info()) and move it up so
that we can use the info when generating the name.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28447>
There is native support for D3D-style untyped UAVs, which are an unsized
array of "records."
This will be needed for acceleration structures, because normal SSBO
descriptors aren't large enough to cover all the 128-byte instance
descriptors for the maximum number of instances (2**24).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28447>
Make sure that struct vk_bvh_geometry_data is defined before
vk_fill_geometry_data(), to fix this error:
In file included from ../src/freedreno/vulkan/tu_acceleration_structure.cc:29:
../src/vulkan/runtime/vk_acceleration_structure.h:138:1: error: 'vk_fill_geometry_data' has C-linkage specified, but returns incomplete type 'struct vk_bvh_geometry_data' which could be incompatible with C [-Werror,-Wreturn-type-c-linkage]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28447>
Found on simulation, complaining about SIMD32 shaders enabled when
using MSAA 16x.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30753>