Both null def and op result in the same correct encoding, but these
instructions optionally read a sgpr, so it makes more sense to use an operand.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26163>
When lowering @load_constant to @load_ubo, the bit size is currently
hard-coded to 32. This causes validation errors when lowering a constant
with a 64b bit size.
This patch fixes this by setting the @load_ubo bit size correctly for
64b constants. This 64b load is later lowered to a 32b load by
ir3_nir_lower_64b_intrinsics.
Fixes Piglit test:
- spec@arb_gpu_shader_fp64@execution@fs-indirect-temp-double-src
This patch has no impact on shader-db.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26191>
Nobody has to advertize it as an extension, but here we are.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25701>
There is no real reason to prevent this as far as I know. And some of the
SPIR-V generated by DPCPP is running into this.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25701>
Shader Local Memory is what NVIDIA calls it in the shader header docs as
well as the command stream headers. Better to be consistent even if it
gets my Intel brain confused. (Intel uses SLM for shared memory.)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26197>
At this point, we're fully trusting NAK to do its own lowering and we
only lower stuff in nvk_shader.c if it's relevant for Vulkan. This also
assumes that NAK is already doing the right thing everywhere.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26197>
This reverts commit 578e10e157.
The only reason for reallocating surfaces as interlaced (on drivers
that supports both progressive and interlaced) was deinterlacing
with postproc filter, but that now also supports interleaved surfaces.
With this change interlaced surfaces are no longer used on radeonsi.
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26174>
The code here handled stores of actual 3-byte values (8-bit, 3-component), but didn't
correctly handle stores of larger 8-bit vectors that were constrained by write mask to
just 3 bytes. In that case, the pad-to-vec4 step was unnecessary and problematic.
Seen in CL CTS test_basic vector_swizzle test group for char3 with CLOn12.
Fixes: c70d94a8 ("nir_lower_mem_access_bit_sizes: Support unaligned stores via a pair of atomics")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26034>
This gives FS I/O the same treatment as we did for vertex attributes in
that we now have a NIR intrinsic which pretty closely matches the
hardware and we lower to that before going into NAK. This gives us a
bit more control in the NIR.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26181>
Currently, shader part caching logic is duplicated between VS prolog and
PS/TCS epilogs. This commit introduces a common abstraction to
deduplicate the code.
Additionally, there are a few design decisions that diverts from the
current implementation:
1. A simple mutex is used instead of reader-writer lock. Prolog/epilog
constructions are serialized, removing the need to free duplicate
objects in case of a race.
2. A CS-local cache is used to quickly lookup an entry without holding a
lock. This eliminates locking in over 99% of cases.
3. A set is used to reduce number of allocations.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26028>
Blob executes a special compute dispatch at the start of each
command buffers. We copy this dispatch as is. At this point
we don't know what this workaround is for.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25888>
For 0x07030002 chip id different names are returned on different
phones: Adreno730v3 or Adreno725v1. Settle on 725 to disambiguate
them.
The only difference from base 730 is that it has conditional
execution of compute shader at the start of every command buffer.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25888>
If we encounter an error during the startup we always want to have
it in the logs to quickly diagnose an issue from user attached logs.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25888>