Added a workaround for a known shader in Horizon Forbidden West that causes
visual corruption on Intel anv driver. The fix clamps fsqrt inputs using
fmax(x, 1e-12) to avoid invalid values. Integrated the workaround via
brw_nir_apply_sqrt_workarounds() and applied it conditionally in the Vulkan
pipeline based on the shader's BLAKE3 hash.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12555
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36990>
GLSL arrays are sized by signed integers. Other places in the Mesa treat
this value is int32_t. Having the value larger than (uint32_t)INT32_MAX
causes assertion failures in many piglit tests.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Closes: #13873
Fixes: 2f8b8649f0 ("iris: Increase max_shader_buffer_size to max_buffer_size")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37282>
VirtIO devices can be configured as platform devices instead of PCI,
which is especially common in microVM projects like libkrun.
Let's allow RADV to probe MMIO virtgpu devices.
Signed-off-by: Val Packett <val@invisiblethingslab.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37281>
Fixes bo leaks caused by wrong ref counting when
using temporary sampler views.
The leak was observed when playing videos using the
DRM_FORMAT_MOD_VENDOR_MTK tiling format
through gstreamer.
Fixes: 0795f3d7e4 ("panfrost: Add a pool to sampler_view")
Signed-off-by: Christian Meissl <meissl.christian@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37325>
Between this and the previous few commits, the Box<> has moved from
outside of Instr to inside the Op enum. This provides a few benefits:
1. We no longer need to allocate for the worst-case Op on every
instruction. For example, OpIAdd3X is 232 bytes and OpBra is 40
bytes, which means we can save some memory on OpBra if we only
allocate those 40 bytes.
2. The Op discriminant is available without chasing a pointer. The type
of op is probably the most frequently used field, and this should
benefit most passes that care about what type of Op they're
handling.
3. Small Ops don't need any indirection at all. For example, OpPBk is
only 4 bytes, which means we can just store it directly in less
space than a pointer.
Compared to Box<Instr>, this is around a 1.4% shader compile time
improvement.
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Seán de Búrca <leftmostcat@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37315>
We can skip creating an &dyn and instead use a match to dispatch into
the inner type's functions.
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Seán de Búrca <leftmostcat@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37315>
This fixes new VKCTS coverage
dEQP-VK.descriptor_indexing.non_uniform_atomics.
Found this while implementing a new extension.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37295>
The border color index must be captured and returned to the app,
otherwise it's broken if samplers aren't replayed in-order.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37291>
When recreating aux_context (typically following a GPU reset event),
the u_log_context must be properly initialized based on the aux_debug
enablement state, just like the logic in the function
radeonsi_screen_create_impl() during aux_context creation. Otherwise,
wrong u_log_context setting will lead to Mesa crashes. For example, in
si_blit_decompress_color(), it will check sctx->log to decide whether
to call u_log_printf(), which ultimately causes a crash.
Signed-off-by: Trigger Huang <Trigger.Huang@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37296>
Move from a single ralloc allocation per instruction to contiguous
blocks of allocations. Still use ralloc for those large blocks.
Each ralloc allocation has at least 5 pointers of overhead, which would
be about a third of the current brw_inst, and get worse as we try to
pack brw_inst better.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36730>
In Release build, goes from 72 to 64 bytes, and now fits
in a single cacheline.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36730>
Now that all the other kinds were added, all transforms to SEND will
come from non-BASE kinds, so we don't need overallocate for BASE
instructions.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36730>
This kind of instruction doesn't have a special struct but
will still be always allocated so that it can be lowered to SEND.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36730>
Fixed the types in brw_inst::bits so the struct is packed correctly.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36730>