vkcube, vkgears and vkmark are crashing with the following error/segfault:
$ NVK_I_WANT_A_BROKEN_VULKAN_DRIVER=1 vkcube
WARNING: NVK is not a conformant Vulkan implementation, testing use only.
Selected GPU 0: GeForce GT 640 (NVK GK107), type: DiscreteGpu
ERROR: couldn't get DataFile for op ldc_nv
Segmentation fault (core dumped)
Handling nir_intrinsic_ldc_nv as per nir_intrinsic_load_ubo in Converter::getFile()
allows to run vkcube, vkgears and vkmark on Nvidia GT640
Fixes: dc99d9b2 ("nvk,nak: Switch to nir_intrinsic_ldc_nv")
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30832>
LLVM optimizes umul_hi with a constant to v_mul_hi_i32_i24_e32 which isn't
always what we need here. This causes miscalculations. To prevent LLVM to
apply this optimization, we insert a optimization barrier.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11761
Suggested-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30810>
Previously, we trusted the caller to set the size to zero for null
descriptors. However, device-generated commands allows specifying a
zero device address with a non-zero size. We may as well handle this
case directly in the MME and keep our command emit shader simple.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30826>
On bifrost we only can use 3 coordinates for images, but
image2DMSArray needs 4 (x, y, sample#, and array index).
We work around this by making the image nr_samples times
higher than the original image, using the Y coordinate to
address the sample plane. This limits the maximum image
height (to 4K pixels instead of 64K pixels in the 16 sample
case) but at least allows us to use the images.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30521>
On valhall, the sample index should go in the R component
of the image load/store/lea instruction. This provides a
straightforward way to implement image2DMS and
image2DMSArray image load and store for valhall.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30521>
The nir_lower_image_atomics_to_global pass can create some image
load/stores, so we need to do the multisample image load/store
lowering after this.
Also, the pass only actually works on bifrost and below, so skip it
for valhall.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30521>
The pan_arch function is useful elsewhere, and doesn't rely on
anything else within genxml/gen_macros.h.
It's useful, for example, to find the architecture from the
GPU id in bifrost_compile.c, where before we were using ad-hoc
shifting.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30521>
The hardware's clear color conversion feature requires invalidating the
texture cache for every fast clear. We're no longer using the hardware
feature, so we longer need the invalidation.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30646>
Instead of using the clear color conversion feature by the hardware, use
software to write out the converted clear color pixel.
When testing a patch which moves a state cache invalidate to occur after
fast clears instead of before, this prevents the following failures on
tgl/zink:
* piglit.spec.arb_texture_cube_map_array.arb_texture_cube_map_array-cubemap
* piglit.spec.ext_framebuffer_object.fbo-generatemipmap-formats
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30646>
The hardware's clear color conversion feature unfortunately requires
invalidating the texture cache for every fast clear. To avoid the
performance penalty that comes with the invalidation, avoid using the
hardware feature and write out the converted clear color pixel
ourselves.
When testing a patch which moves a state cache invalidate to occur after
fast clears instead of before, this prevents the following failures on
icl/zink:
* piglit.fast_color_clear.fcc-read-after-clear sample tex
* piglit.spec.arb_clear_texture.arb_clear_texture-cube
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30646>
Prevents the next patch from failing many multisampled, signed integer
rendering tests. For example:
dEQP-VK.renderpass2.suballocation.multisample_resolve.r8_sint.samples_4
Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30646>
We don't have an RSD descriptor on Valhall, and the vertex
attributes are part of the driver descriptor set, which we
re-emit anyway.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30736>
Quite a few things are common to Valhall/Bifrost. Specialize what's
different, and move the move to the root driver directory.
We don't compile it on v10 as this requires the cmd_buffer bits
that are not yet defined.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30736>
When PANVK_DEBUG=dump, all internal buffers get dumped, but not the user
ones, because they don't have a host address attached to them. Let's
register one when mappings dump is enabled.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30736>
The hardware limit on Valhall is 16 descriptor tables, but we reserve
one for our internal descriptors (dummy sampler, vertex attributes and
dynamic buffers), which leaves us with 15 user descriptor sets.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30736>
The JM implementation of queue_finish() is simple enough to be inlined,
but that won't be the case of the CSF implementation. So let's make
this function per-arch so we can move it to panvk_vX_queue.c.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30736>
We will have a new BO pool for CS buffers in the CSF backend.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30736>
It makes the reset of command buffers a tad simpler, and it allows
re-cycling push sets.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30736>
Co-developed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30736>
Co-developed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30736>
Dynamic rendering local read allows the application to use subpass input
attachments with feedback loops. But unless legacy RPs where it's
possible to determine feedback look at creation time, with dynamic
rendering it's not possible.
To fix that, the driver needs to determine at draw time if a feedback
loop is present, and it needs to decompress DCC/HTILE if necessary.
See https://gitlab.khronos.org/vulkan/vulkan/-/issues/3928 for more
information.
Note that VKCTS is still missing coverage but this has been reported.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11127
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30124>
VK_KHR_dynamic_rendering_local_read allows the application to sample
from a subpass input attachment where this attachment is also the color
attachment (aka. feedback loop). With legacy RPs, it's easy to detect
that at RP creation time by looking at the input<->color indices but
with DRLR this needs to be determined dynamically.
This flag would help when legacy RPs are converted to dynamic rendering
because it's not possible to know if a subpass used input attachments.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30124>
Now that we are explicitly checking for all supported formats, start
rejecting anything that isn't supported. This should make it easier to
avoid accidentally support formats without enabling the right
extensions-bits first.
A few tests regress on Lima, because we (corretly) deny using GL_FLOAT
as a texture component type. This should be fixed in the Piglit case to
skip the test there instead, but for now let's just update and document
the change.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29835>
Half-float textures got a bit strange in GLES2; they were added by the
OES_texture_float extension, but that added a *different* enum with a
*different* value than what ended up in ARB_half_float_pixel and GLES3.
So, we need to check separately for these. The former one is only
supported on GLES.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29835>