For emitting VK_INDIRECT_COMMANDS_TOKEN_TYPE_STATE_FLAGS_NV.
The scissor workaround for GFX9 is only needed if the state is emitted,
so move it there as well.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23584>
this fixes the case where a draw without sample shading precedes
a draw with sample shading without changes to the sample mask
Fixes: cc9e958053 ("lavapipe: fix DS3 min sample setting")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23663>
The spec does not have such requirement, but anv requires it for
validating the offset. However, for DRM_FORMAT_YVU420, chroma channels
can be swapped upon import to match B/R channel order of
VK_FORMAT_G8_B8_R8_3PLANE_420_UNORM.
This fixes some sw codec path in Instagram when interop with gpu.
v2: fix image memory requirement for re-ordered explicit import
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Emma Anholt <emma@anholt.net> (v1)
Reviewed-by: Matt Tuner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23643>
We have a ubfe_imm helper that creates ubfe ops. Not all drivers support ubfe,
however, as it requires SM5 semantics. A few drivers support oly
ubitfield_extract. They should still get the convenience of an _imm helper, so
add a symmetric helper.
It might be nice to unify these helpers into a single helper that asserts its
inputs do not overflow (such that the two ops become equivalent) and emits
either ubfe or ubitfield_extract depending on the underlying driver. That is
left for future work as it's unclear exactly what naming/semantics we want.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23351>
Seen as a firmware assert when using a debug build of the firmware
and tested against:
dEQP-VK.pipeline.monolithic.render_to_image.core.1d_array.huge.width_layers.r8g8b8a8_unorm_d16_unorm
Signed-off-by: SoroushIMG <soroush.kashani@imgtec.com>
Acked-by: James Glanville <james.glanville@imgtec.com>
Reported-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23651>
In function ‘lower_load_kernel_input’,
inlined from ‘clc_nir_lower_kernel_input_loads’ at ../src/microsoft/clc/clc_nir.c:205:28:
../src/microsoft/clc/clc_nir.c:169:7: warning: ‘base_type’ may be used uninitialized [-Wmaybe-uninitialized]
169 | glsl_vector_type(base_type, nir_dest_num_components(intr->dest));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../src/microsoft/clc/clc_nir.c: In function ‘clc_nir_lower_kernel_input_loads’:
../src/microsoft/clc/clc_nir.c:151:24: note: ‘base_type’ was declared here
151 | enum glsl_base_type base_type;
| ^~~~~~~~~
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23666>
Also remove unused backend CEIL lowering.
Single regressed gnome-shell shader due to fceil followed by f2i32
where before nir_lower_int_to_float would recognize that we already
have integer and emit mov instead of trunc for the f2i32. We can
clean this up easily once we move ntt to the backend.
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23642>
According to the HW docs, TGL+ no longer requires that a CS stall be
added to a texture cache invalidate done in the compute pipeline.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18725>
The render target flush would have been needed if it was possible to:
1) pollute the render cache and write to the data port in one draw
call.
2) perform a subsequent operation that assumed the render cache was
up-to-date.
However, this is not possible for the two glMemoryBarrier barrier bits
that get translated to this pipe barrier:
* GL_TEXTURE_FETCH_BARRIER_BIT is only used for sampling operations.
It's possible to pollute the render cache and data cache with writes
to a texture in one draw call (1). However, the GL spec states that
apps cannot assume that any existing render caches are up-to-date for
sampling the written locations immediately afterwards. Apps are
required to use glTextureBarrier before the sampling operation, so
requirement #2 is not satisfied.
* GL_PIXEL_BUFFER_BARRIER_BIT could be used for a PBO upload (2), but
it's not possible to pollute the render cache and data cache with a
PBO access in one draw call. PBOs cannot be bound to framebuffers
for rendering, so requirement #1 is not satisfied.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1)
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18725>
This reverts commit 0523607ebb.
The issue that commit worked around seems to have been fixed as of
commit 1c8b4940eb ("iris: Emit flushes for push constant source
buffers"). I could no longer reproduce it from that point onward with
this revert applied.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18725>
For 32bpc formats, the ICL+ sampler fetches the raw clear color dwords
used for rendering instead of the converted pixel dwords typically used
for sampling. The CLEAR_COLOR struct page documents this for 128bpp
formats, but not for 32bpp and 64bpp formats.
In blorp_copy, map R11G11B10_FLOAT to R8G8B8A8_UINT instead of R32_UINT.
This will cause the sampler to fetch the clear color pixel, allowing
drivers to keep clear color support enabled during copies.
If iris is forced to convert blits to copies, this patch fixes the
following test on gfx12:
dEQP-GLES3.functional.fbo.color.repeated_clear.blit.rbo.r11f_g11f_b10f
At the moment, both iris and anv won't hit this issue outside of
blorp_copy. This is due to the read/write access restrictions they
currently place on texture views that reinterpret the surface format.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8964
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23604>
isl will assert out otherwise. Hit this with intel_stub_gpu on arm64, but it is
a legitimate bug since someone might plug a DG2 card into a workstation-grade
arm64 or ppc64 supporting PCIe (it exists).
This forward ports the logic from crocus, which checks for both SSE at a
compile-time level as well as in the CPU caps. This might be excessive since DG2
cards apparently wouldn't work properly on old non-SSE x86 boxes anyway? I just
crocus-and-pasted.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23608>