Make sure that lowering undone in elk_nir_optimize are reapplied.
No shader-db or fossil-db changes on any Intel platform. This is most
likely to impact either Gfx8 on ANV or Gfx7.5 on HASVK. I don't
fossil-db test either of those platforms.
I tried doing a similar thing here as is done in BRW (previous commit),
but that caused a couple Haswell shaders to fall off a performance
cliff:
total spills in shared programs: 8247 -> 8311 (0.78%)
spills in affected programs: 6 -> 70 (1066.67%)
helped: 0 / HURT: 2
total fills in shared programs: 8558 -> 8910 (4.11%)
fills in affected programs: 6 -> 358 (5866.67%)
helped: 0 / HURT: 2
Fixes: 442daeb54a ("nir/opt_algebraic: use fcanonicalize")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit df704bd38e)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
The number of fields comes from the shader, so it could be a value large
enough that using alloca would be problematic.
Fixes: c11833ab24 ("nir,spirv: Rework function calls")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 9017d37e84)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
The number of fields comes from the shader, so it could be a value large
enough that using alloca would be problematic.
Fixes: 2a023f30a6 ("nir/spirv: Add basic support for types")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 3da828d2dd)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
This was incorrect for OpenCL due to the possibility of variable shared memory
existing despite shared_size == 0. Fortunately the optimization it was trying to
do should be done in NIR via nir_opt_barrier_modes so we can just drop the brw
code and move on with our merry lives. Fixes OpenCL tests on Iris:
non_uniform_work_group non_uniform_3d_barriers
basic async_strided_copy_local_to_global
Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit bd5ebbb2f8)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
This fixes new VKCTS coverage
dEQP-VK.api.copy_and_blit.core.use_after_copy.*.
is_stencil isn't set for RadeonSI because it doesn't do SDMA copies
with Z/S.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 1be4ffdff9)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
Caught through VVL test NegativeWsi.SwapchainImageFormatList. The test
would try to create a swapchain with a color space from
VK_EXT_swapchain_colorspace without enabling the extension. This is
because wsi would expose those color spaces even when the extension was
not enabled.
Fixes: fd045ac99c ("wsi/metal: add support for color spaces")
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Signed-off-by: Aitor Camacho <aitor@lunarg.com>
(cherry picked from commit e6f118f12b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
A descriptor buffer promoted to push constants requires a constant
cache invalidation if it is modified on the device.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 42b70cf05a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
Given a situation like this :
- CB_A: begin, renderDepthA, end
- CB_B: begin, computeA, barrier (depth), computeB, end
The depth cache is not being flushed between renderDepthA & computeB
because :
- it's not flushed at the end of CB_A (it's not required)
- when CB_B starts, we're still on GFX pipeline mode but do not
flush render caches because pipeline mode is unknown
- when barrier is CB_B is executed, we're already in compute
pipeline mode and HW cannot flush depth.
The fix is to flush RT/depth cached when switching from unknown
pipeline mode any pipeline mode.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e6dae6ef5f ("vulkan: Optimize implicit end_subpass barrier")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14816
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Tested-by: David Gow <david@davidgow.net>
(cherry picked from commit 888ac904a3)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
This bit is set in mocs for other protected attachment types by
anv_image_fill_surface_state() however was ommited for depth/stencil
attachments here.
Without the protected bit set, it causes heavy black artifacting when
attaching a protected depth attachment image to a framebuffer.
Fixes: 794b0496e9 ("anv: enable protected memory")
Signed-off-by: Juston Li <justonli@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit f84ed620c2)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
`operands_match` was modifying instruction source operands in-place
(through the `elk_fs_reg *src` pointer member) and relying on a
save/restore pattern to undo the modifications. Work on local copies
instead, which is simpler and avoids mutating shared state in a
comparison function.
Fixes: 47c4b38540 ("i965/fs: Allow CSE to handle MULs with negated arguments.")
(cherry picked from commit 14c65322e8)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
The MUL case in `operands_match` was reading and writing the `.f` union
member unconditionally, even when the register's `.file != IMM`. In that
case `.f` aliases the struct containing `.nr`/`.swizzle`/etc, so the
`fabsf()` call could corrupt the `.nr` by clearing bit 31.
Guard all `.f` accesses with `.file == IMM` checks.
Fixes: 47c4b38540 ("i965/fs: Allow CSE to handle MULs with negated arguments.")
(cherry picked from commit 93f39f87c4)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
`operands_match` was modifying instruction source operands in-place
(through the `brw_reg *src` pointer member) and relying on a
save/restore pattern to undo the modifications. Work on local copies
instead, which is simpler and avoids mutating shared state in a
comparison function.
Fixes: 47c4b38540 ("i965/fs: Allow CSE to handle MULs with negated arguments.")
(cherry picked from commit b302faad8b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
The MUL case in `operands_match` was reading and writing the `.f` union
member unconditionally, even when the register's `.file != IMM`. In that
case `.f` aliases the struct containing `.nr`/`.swizzle`/etc, so the
`fabsf()` call could corrupt the `.nr` by clearing bit 31.
Guard all `.f` accesses with `.file == IMM` checks.
Fixes: 47c4b38540 ("i965/fs: Allow CSE to handle MULs with negated arguments.")
(cherry picked from commit f5e0f63216)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
Could do better by checking which registers are clobbered/preserved,
but that's unlikely to be useful anyway.
Backport-to: 26.0
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
(cherry picked from commit fc7b5d7eed)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
The kernel capabilty has the `FPFastMathMode` decoration, but not the
`FPFastMathDefault` execution mode, so a SPIR-V module not using
`SPV_KHR_float_controls2` has no way of setting any defaults.
Fixes: 9da2d21804 ("vtn: implement default fp_math_ctrl without using execution mode")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Tested-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit faf3a93e8f)
[Eric: adjusted commit because of missing 46a617884e, as suggested by the author
at https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39790#note_3325830]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
I somehow screwed this up on my previous attempt at fixing this bug,
This should fix the loop limiter bug on big endian properly.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Cc: mesa-stable
Fixes: e28cfb2bad ("gallivm: handle u8/u16 const loads properly on big-endian.")
(cherry picked from commit c016346b50)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
When a buffer is deleted, we have to remove it from all binding points.
We were re-using the code for BindBufferRange for this; however, this
caused the general binding point to be unbound (bound to NULL)
unconditionally, even if a different buffer is bound there. Fix this by
inlining the various bind calls into the delete buffers code.
cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14755
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
(cherry picked from commit fa418f1e73)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
Only for partial copies because image stores don't decompress on writes
(ie. HTILE isn't updated by image stores).
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 9f5a20abde)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
libclc doesn't so we have to. fixes math_brutefore cbrt on Iris.
Co-authored-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
(cherry picked from commit af954427bf)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
On the ppc64le architecture error log fail to compile with error:
../src/virtio/vulkan/vn_renderer_virtgpu.c: In function ‘virtgpu_ioctl_map’:
../src/virtio/vulkan/vn_renderer_virtgpu.c:751:66: error: format ‘%llu’ expects argument of type ‘long long unsigned int’, but argument 6 has type ‘__u64’ {aka ‘long unsigned int’} [-Werror=format=]
751 | "mmap failed: gpu_fd=%d, handle=%u, size=%zu, offset=%llu, err=%s",
| ~~~^
| |
| long long unsigned int
| %lu
752 | gpu->fd, gem_handle, size, args.offset, strerror(errno));
| ~~~~~~~~~~~
| |
| __u64 {aka long unsigned int}
cc1: some warnings being treated as errors
Parse the parameters to fix the failure.
Fixes: a49b7adad8 ("venus: add error log coverage for virtgpu backend")
(cherry picked from commit dd3fe2d671)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
There might be cases under which we can make this work but they're
tricky at best. For now, don't even try.
Cc: mesa-stable
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
(cherry picked from commit 918624174b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
There's a nice little comment here saying we use the same write mask (an
out of date term in NIR) and swizzle but we're no longer actually doing
that. Depending on nir_builder magic, we may actually generate a scalar
when we really want a vector. The fix is to use more builder helpers
and just eat the potential copy.
Fixes: 3180656bbc ("nir: don't use nir_build_alu() with incomplete sources")
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
(cherry picked from commit 711b3358a8)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
the lowerings for e.g. f2f16_rtp have carefully written sequences using
Infinity. nir_opt_algebraic will stomp right through this. `feq x, inf`
without an exact flag is basically always a bug. Disable fast math here.
Fixes OpenCL CTS test_half on Iris.
Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
(cherry picked from commit 91550d0709)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
The passed flags is always zero on the import paths:
- panfrost_bo_import
- panvk_AllocateMemory
- panvk_GetMemoryFdPropertiesKHR
Fixes: 1c7793ea0b ("panvk: Advertise a HOST_CACHED memory type if we have WC maps")
Tested-by: Valentine Burley <valentine.burley@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
(cherry picked from commit 8d25f9821b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
Fix compiler error:
../src/freedreno/decode/cffdec.c:580:7: error: assigning to 'char *'
from 'const char *' discards qualifiers
[-Werror,-Wincompatible-pointer-types-discards-qualifiers]
580 | p = strstr(name, "CONST");
| ^ ~~~~~~~~~~~~~~~~~~~~~
glibc now provides C23-style type-generic string functions. strstr
returns const char * when passed a const char * argument. Update p
declaration to const since it's only used for offset calculation.
Fixes: 1ea4ef0d3b ("freedreno: slurp in decode tools")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
(cherry picked from commit bc34a122f3)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>