Since the stack pointer may wrap around the stack size in overflow
cases, traversal logic calculates the real stack pointer with
nir_umod_imm(b, stack, args->stack_entries * args->stack_stride).
For ray queries, "stack" was initialized to
"stack_base + local_invocation_idx * 4". This was completely broken, as
the umod would later delete the stack base completely and overwrite the
start of LDS, which belongs to the apps' shared memory.
Instead, add the stack base as a constant offset in the load/store_stack
callback. (This should also save 1 VALU per ray query)
Also, delete radv_ray_traversal_args::stack_base since it's unused now.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40420>
CB_RESOLVE isn't very fast and we already have two different paths,
it's been removed in hw since GFX11. PAL and RadeonSI removed support
for it too.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39957>
GFX12 encoding added one bit to the stack offset, doubling the limit on
the stack base offset that is possible to encode. In practice, this
always allows using bvh_stack_push* instructions on GFX12 since LDS is
still 64kB.
Cc: mesa-stable
Fixes: 59a39779 (radv/rt: Only use ds_bvh_stack_rtn if the stack base is possible to encode)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40405>
If a set layout is missing the driver can't compute the dynamic buffer
start offsets correctly. The only solution is to load these offsets from
an user SGPR.
To avoid adding more complexity, these offsets are re-emitted every
time dynamic buffers are dirty. That shouldn't matter because the
combination of dynamic buffers and independent sets is just super rare.
This fixes new VKCTS coverage
dEQP-VK.pipeline.pipeline_library.graphics_library.independent_sets_random.*.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39988>
This is possible since VK_KHR_maintenance10.
This fixes new VKCTS coverage in
dEQP-VK.pipeline.*.multisample.m10_resolve.*.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39956>
Since the stride is always 32 dwords, we need to treat the workgroup
size as multiples of that value. Using MAX2() only works for cases where
the workgroup size is less than 32, which was hit by some CTS with 1x1
workgroups.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39981>
The hardware only provides 13 bits for encoding the stack base (in
dwords). That translates to the stack base being required to be below
8192 dwords, or 32kB. It's possible to exceed this - LDS is 64kB after
all. Add an explicit check to make sure we don't end up with offsets
that overflow the hw's address fields. This fixes Metro Exodus Enhanced
Edition, which was using ray queries in a 1024-thread sized workgroup,
resulting in exactly 64kB of LDS being required for the stack.
This check isn't required for RT pipelines as we always use 32 or 64
wide workgroups with no other LDS used, so it's impossible to reach this
stack base limit.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39691>
It was incorrect to mark chit/miss arguments as discardable without
the equivalent in the traversal shader. Also, tail calls with modified
parameters that aren't marked discardable are incorrect.
This could lead to random corruption by clobbering parameter values
across two levels of nested calls: A Raygen shader calls traversal,
expecting e.g. the ray tMax parameter to be preserved. Traversal
overwrites the parameter's register with the hit t and tail-calls chit,
which immediately returns to raygen. Now the raygen shader still has the
clobbered tMax (which is actually the ray hit t) - if it calls traversal
multiple times, the second traversal iteration may use the previous
ray's hit t as tMax instead of the intended value.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39579>
There were two issues here:
1. Tail calls where the tail-callee receives modified parameters are
hazardous and only work if the parameter is return or discardable.
Otherwise, the caller of the function that executes the tail-call may
not expect some of the parameters to be clobbered.
2. There was also an indexing confusion with the call instruction vs.
call signature parameters. The call instruction has not been adapted
to the new lowered signatures, where the system args are prepended. To
make things clearer, split the loop into two, one iterating over
parameters in the call signature and one for parameters of the call
instruction.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39579>
Needs to consider the base offset, otherwise it's resolving to the
first 3D slice.
Fixes very recent VKCTS coverage dEQP-VK.pipeline.*.multisample.m10_resolve.*.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39393>
We can express any-hit/intersection shaders as functions, too.
Any-hit/Intersection shaders need the usual parameters like launch
IDs/descriptor data/ray properties, origin, direction/etc., but also
some special parameters related to traversal state. Any-hit/intersection
shaders need to return whether the hit was accepted and/or traversal
should be terminated, as well as the intersection T value (for
intersection shaders). Both any-hit and intersection shaders also need
to be passed hit attributes via parameters. Closest-Hit shaders need
those too, but we pass them out-of-band via LDS. LDS is used for the
traversal stack when any-hit/intersection shaders, so we need to pass
them via parameters.
Hit attributes are similar to ray payloads in the sense that they're
dynamically sized depending on how much space the application uses.
However, unlike ray payloads, hit attribute sizes have a strict upper
bound of 8 dwords. To make managing parameters easier, we put all hit
attributes in a single vector parameter with 0-8 components. This
prevents having a function with two sets of arbitrary numbers of
parameters.
This commit sets up ahit/isec function signatures and implements
lowering for ahit/isec-specific intrinsics in the context of these
functions. Subsequent commits will merely have to call into these
functions to execute a separate-compiled any-hit/intersection shader.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39314>
terminate_ray should only return from any-hit shaders, it should not
skip the intersection shader. If we insert a nir_jump_return when
processing the already-inlined any-hit shader, the intersection shader
will be skipped.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39314>
This is unused by any callers currently, but will be useful for nir
algebraic pattern testing, and as a way to turn our comments in
nir_opcodes.py into actual C code. For now, always returns false.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39076>