In the future this might even do something clever.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>
While Jay supports subgroups, efficient reductions are TODO so it's probably
better not to run this pass yet.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>
Jay will need more work to handle these payloads properly, especially in SIMD32.
For now, just disable the optimization for Jay for correctness.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>
This is a 2x16 bitpacked version of load_pixel_coord which maps directly to the
hardware value and is much easier for Jay to consume due to the sadness that is
true 16-bit on Intel. Jay will lower to this internally.
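As a rough illustration of what "2x16 bitpacked" means here, the sketch below packs two 16-bit coordinates into one 32-bit value. The helper names are hypothetical, and the layout (X in the low half, Y in the high half) is an assumption made for the example, not something this commit message states.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: X lives in bits 0..15 and Y in bits 16..31. */
static inline uint32_t
pack_pixel_coord(uint32_t x, uint32_t y)
{
   /* Both coordinates must fit in 16 bits to be packed losslessly. */
   assert(x <= UINT16_MAX && y <= UINT16_MAX);
   return x | (y << 16);
}

/* Extract the X coordinate (low half under the assumed layout). */
static inline uint16_t
unpack_pixel_coord_x(uint32_t packed)
{
   return (uint16_t)(packed & 0xffff);
}

/* Extract the Y coordinate (high half under the assumed layout). */
static inline uint16_t
unpack_pixel_coord_y(uint32_t packed)
{
   return (uint16_t)(packed >> 16);
}
```

Consumers that only need one coordinate can mask or shift the packed 32-bit value instead of handling true 16-bit values.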
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>
Jay will use this to lower & optimize subgroup shuffles. This is closer to
how Intel hardware works but still much higher level than the hardware
primitive. It does, however, get us NIR optimizations on the multiply.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>
This exposes the underlying render target write message directly, which Jay will
use to lower RT writes in NIR. I'm still on the fence about what exactly this
should look like but this is good enough for GLES3.0 (so, multiple render
targets but not necessarily dual source blending).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>
This maps directly to what Intel's thread payload gives us, allowing us to
optimize out frcp's in some cases. Jay will use this.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>
These lowered versions map to what Jay can deal with. The hardware is more
flexible, but we are not, due to data model restrictions. We choose to lower to
get us off the ground; we can revisit later.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>
The immediate fs_clear_color shader uses IMM[0] but still declares
CONST[0][0]. That can make drivers try to read a fragment constant
buffer even though one is never uploaded on this path. Only declare
CONST[0][0] when the shader actually uses a constant buffer.
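A minimal sketch of the idea, assuming a hypothetical helper that emits simplified TGSI-like text (this is not the actual u_blitter code, and the shader text is abbreviated for illustration): the constant buffer declaration is emitted only for the variant that reads it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes a simplified clear-color fragment shader. */
static int
build_clear_color_fs(char *buf, size_t size, bool use_const_buffer)
{
   assert(buf != NULL && size > 0);
   return snprintf(buf, size,
                   "FRAG\n"
                   "DCL OUT[0], COLOR[0]\n"
                   "%s"
                   "MOV OUT[0], %s\n"
                   "END\n",
                   /* Declare CONST[0][0] only when it is actually read. */
                   use_const_buffer ? "DCL CONST[0][0]\n"
                                    : "IMM[0] FLT32 {0.0, 0.0, 0.0, 0.0}\n",
                   use_const_buffer ? "CONST[0][0]" : "IMM[0]");
}

/* Returns true if the shader text declares the constant buffer. */
static bool
declares_const_buffer(const char *text)
{
   return strstr(text, "DCL CONST[0][0]") != NULL;
}
```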
Fixes: 2ff9fa8b72 ("gallium/u_blitter: add a new fs_color_clear variant")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40760>
On big-endian hosts, r300 handles A8R8G8B8 and X8R8G8B8 by using
DWORD swap and programming component order as the matching B8G8R8A8 or
B8G8R8X8 formats. Reuse the same mapping when packing CBZB clear colors.
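The sketch below illustrates why the two formats pair up: reversing the bytes of a 32-bit word turns an A,R,G,B byte sequence into B,G,R,A. The enum and function names are hypothetical and do not mirror the r300 driver's actual helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the four formats named above. */
enum fmt { FMT_A8R8G8B8, FMT_X8R8G8B8, FMT_B8G8R8A8, FMT_B8G8R8X8 };

/* Pick the component order to program, given the host endianness. */
static enum fmt
fmt_for_host(enum fmt format, bool big_endian)
{
   if (!big_endian)
      return format;

   switch (format) {
   case FMT_A8R8G8B8: return FMT_B8G8R8A8;
   case FMT_X8R8G8B8: return FMT_B8G8R8X8;
   default:           return format;
   }
}

/* Reverse the four bytes of one dword (what a DWORD swap does). */
static void
dword_swap(const uint8_t in[4], uint8_t out[4])
{
   assert(in != out); /* this sketch does not support in-place swaps */
   out[0] = in[3];
   out[1] = in[2];
   out[2] = in[1];
   out[3] = in[0];
}
```

A clear-color packer that calls something like `fmt_for_host()` first would then use the same component order as the rest of the pipeline.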
Fixes the bad lower-screen colors in Extreme TuxRacer on RV350.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40885>
The ordered atomic commits the post-add offset to memory, but overflow was
computed using the pre-add offset, causing partial overflows to be missed and
counters to become corrupted. Compute overflow based on the post-write buffer
offset, rather than the offset before the current workgroup's writes.
Fixes: "KHR-GL46.transform_feedback_overflow_query_ARB.multiple-streams-one-buffer-per-stream"
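To illustrate the difference, here is a minimal sketch with hypothetical names and a simplified model (one write of `write_size` bytes at `offset_before` into a buffer of `buffer_size` bytes); it is not the driver's actual code. A check on the pre-add offset alone misses a write that starts inside the buffer but ends past it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Check using only the offset before the write: misses partial overflows. */
static bool
overflow_from_pre_add(uint32_t offset_before, uint32_t buffer_size)
{
   return offset_before >= buffer_size;
}

/* Check using the offset after the write, which is what ends up in memory. */
static bool
overflow_from_post_add(uint32_t offset_before, uint32_t write_size,
                       uint32_t buffer_size)
{
   /* The sketch assumes the addition itself does not wrap around. */
   assert(write_size <= UINT32_MAX - offset_before);
   return offset_before + write_size > buffer_size;
}
```

For example, with a 64-byte buffer, a 32-byte write at offset 48 is not flagged by the first check but is flagged by the second.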
Reviewed-by: Marek Olsak <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40745>
Some video decoders spit out AFBC(16x16,sparse,split) images. Advertise
support for this modifier so we can import such images.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40746>
Remove unused includes, and replace heavy includes (e.g. `tu_common.h`) with
lighter ones where those suffice.
IWYU (include-what-you-use) was used to find these cases.
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40853>
Add missing folder patterns, and make the `^<vulkan/` pattern
apply to system includes too, so that all system includes are
in one group.
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40853>
This avoids wasting CI time by catching the error early. We do
still need the meson test to catch these issues locally when
rebuilding from an already configured build directory though.
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40853>
Add a test to ensure that we're always using one of the wrapper
files instead of including the XML generated headers directly.
Assisted-by: Opencode (MiniMax M2.7)
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40853>
Add some wrapper header files so that we always include everything
that's needed by the generated header. This is in preparation for
setting up a script which enforces using these instead of importing
the xml generated headers directly.
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40853>
qfb shortcuts the synchronous query wait, so venus might be unable to
populate the device lost error from the host Vulkan driver. This change
emits a host call upon vn_relax warn order for that purpose. If the host
call ends up successful, we double check the qfb availability for
consistency to avoid silent regressions in the qfb code path.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40790>
This is possible with two vectors which share a temporary, though I don't
think it currently happens in practice.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40825>
Implements ops without needing the NIR lowering.
The sum and carry parts can later be combined into a single instruction.
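The commit message does not name the ops, so as an assumption the sketch below uses a 32-bit unsigned add with carry-out to show what "sum and carry parts" could look like when computed together. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compute the wrapped 32-bit sum and the carry-out of a + b. */
static void
uadd_with_carry(uint32_t a, uint32_t b, uint32_t *sum, uint32_t *carry)
{
   assert(sum != NULL && carry != NULL);
   *sum = a + b;        /* unsigned addition wraps modulo 2^32 */
   *carry = *sum < a;   /* a wrapped sum is smaller than either operand */
}
```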
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Tested-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40607>
This ensures we don't see vec5 @load_ssbo_uniform_block_intel, which
requires special backend handling, instead rounding up in NIR to vec8,
which the LSC can do. Affects
dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.ivec3_lowp_compute.
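A minimal sketch of the rounding, assuming the supported vector sizes are 1 to 4, 8, and 16 components (that list and the function name are assumptions for illustration, not taken from this commit):

```c
#include <assert.h>

/* Round a component count up to the next size assumed to be supported. */
static unsigned
round_up_block_components(unsigned num_components)
{
   assert(num_components >= 1 && num_components <= 16);

   if (num_components <= 4)
      return num_components;   /* vec1..vec4 are kept as-is */

   return num_components <= 8 ? 8 : 16;
}
```

Under these assumptions, a vec5 load becomes a vec8 load, and the extra components are simply ignored by the consumer.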
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40877>