There are new combinations of ordered and unordered dependencies
available for the instructions to use, which among others include:
- combining FLOAT and INT pipe deps in SENDs;
- combining SRC mode deps in regular instructions for the inferred type.
This patch enables a couple of tests checking for the first case.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31375>
A recent algebraic optimization caused a function that used to be inlined
with llvmpipe CL to no longer be inlined. However, that function
has a barrier in it.
Handling barriers from inside a callstack is hard for llvmpipe
coroutines, so just force functions with barriers to be inlined.
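A minimal sketch of the idea, assuming a simplified, hypothetical IR (the `instr_kind` and `function` types below are illustrative stand-ins, not the NIR data structures): scan each function for a barrier and, if one is found, mark the function so that it is always inlined.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified IR: these are NOT the real NIR types. */
enum instr_kind { INSTR_ALU, INSTR_CALL, INSTR_BARRIER };

struct function {
    const enum instr_kind *instrs;
    size_t num_instrs;
    bool should_inline; /* when set, the inliner must inline this function */
};

/* Returns true if the function body directly contains a barrier. */
static bool function_has_barrier(const struct function *f)
{
    for (size_t i = 0; i < f->num_instrs; i++) {
        if (f->instrs[i] == INSTR_BARRIER)
            return true;
    }
    return false;
}

/* Marks every function containing a barrier for forced inlining and
 * returns how many functions were newly marked. */
static size_t force_inline_barrier_functions(struct function *funcs,
                                             size_t count)
{
    size_t marked = 0;
    for (size_t i = 0; i < count; i++) {
        if (!funcs[i].should_inline && function_has_barrier(&funcs[i])) {
            funcs[i].should_inline = true;
            marked++;
        }
    }
    return marked;
}
```

This sketch only looks at barriers that appear directly in a function body; a barrier reached through a nested call would need the marking to be propagated to callers as well.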
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32204>
We (QNX) are using this with our VMM and our Linux reference distro (which is currently in development).
With libaemu removed, it's much easier to integrate into a Linux-guest
build.
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32161>
We can treat VK_QUEUE_FAMILY_FOREIGN_EXT as the host. This makes sure
that, on release, all subqueues self-wait and all caches are flushed.
On acquire, all caches are invalidated.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32164>
The spec says that the user-specified srcStageMask/srcAccessMask should
be ignored for the acquire operation and the user-specified
dstStageMask/dstAccessMask should be ignored for the release operation.
Since we don't need any special handling for VK_QUEUE_FAMILY_EXTERNAL,
override them to NONE.
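The rule can be sketched as follows, assuming a simplified, hypothetical barrier struct (`barrier_sketch` and `normalize_ownership_transfer` are illustrative names, not the Vulkan structs or the driver's code): for an acquire the source masks are cleared, and for a release the destination masks are cleared.

```c
#include <stdint.h>

/* Hypothetical, simplified barrier description; not the Vulkan structs. */
struct barrier_sketch {
    uint32_t src_queue_family;
    uint32_t dst_queue_family;
    uint64_t src_stage_mask;
    uint64_t src_access_mask;
    uint64_t dst_stage_mask;
    uint64_t dst_access_mask;
};

/* Clears the masks that are to be ignored for a queue family ownership
 * transfer, as seen from the queue family recording the barrier. */
static void normalize_ownership_transfer(struct barrier_sketch *b,
                                         uint32_t current_family)
{
    if (b->src_queue_family == b->dst_queue_family)
        return; /* not an ownership transfer, keep the masks as given */

    if (b->dst_queue_family == current_family) {
        /* Acquire: the user-specified source masks are ignored. */
        b->src_stage_mask = 0;
        b->src_access_mask = 0;
    } else if (b->src_queue_family == current_family) {
        /* Release: the user-specified destination masks are ignored. */
        b->dst_stage_mask = 0;
        b->dst_access_mask = 0;
    }
}
```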
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32164>
Unless rustfmt gets informed that we use the 2021 edition, it chokes on
C-string literals.
Passing the `--edition` parameter with every invocation would be annoying.
Create a config file instead.
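For illustration, a rustfmt configuration file (rustfmt looks for `rustfmt.toml` or `.rustfmt.toml`; which name this patch uses is not stated here) that pins the edition could be as small as:

```toml
# Tell rustfmt which Rust edition to parse the sources as.
edition = "2021"
```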
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31510>
In two functions implementing resource discard, rebind_resource is called
on the resource before its track record is reset. This prevents the update of
dirty_resource or dirty_shader_resource because of the conditions in
needs_dirty_resource. With rsc->track reset and the dirty_resource bits
missing, further calls to transfer_map will not try to reallocate
resource storage when needed.
One way to reproduce the issue in both functions is to execute at least
3 draws, modifying the bound texture or VBO each time. This patch fixes those
cases and some related piglit tests on a5xx, and should fix them on other
GPUs. It also fixes rendering in Firefox and vsraytrace (except for a vertical
line at the right edge).
Fixes: 0a62a874fc ("freedreno: Re-work dirty-resource tracking")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10374
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32167>
Many debug flags influence shader codegen but are currently not included
in the hash key. This causes surprising effects as cache lookups may
return shaders compiled with different debug flags than currently in
effect. This patch fixes this by including all debug flags in the
shader hash key.
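A minimal sketch of why the flags belong in the key, assuming a hypothetical two-field key and an FNV-1a hash (both are illustrative; neither the struct nor the hash function is taken from the driver): once the debug flags are hashed, two otherwise identical keys with different flags no longer map to the same cache entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache key; the real key has many more fields. */
struct shader_key_sketch {
    uint32_t variant_bits; /* state that already selected a variant */
    uint32_t debug_flags;  /* debug flags that influence codegen */
};

/* 32-bit FNV-1a over a byte range. */
static uint32_t fnv1a32(const void *data, size_t size)
{
    const uint8_t *bytes = data;
    uint32_t hash = 2166136261u;
    for (size_t i = 0; i < size; i++) {
        hash ^= bytes[i];
        hash *= 16777619u;
    }
    return hash;
}

/* Hashes the key including the debug flags. The fields are copied into
 * an array first so that struct padding can never leak into the hash. */
static uint32_t shader_key_hash(const struct shader_key_sketch *key)
{
    const uint32_t fields[2] = { key->variant_bits, key->debug_flags };
    return fnv1a32(fields, sizeof(fields));
}
```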
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: c323848b0b ("ir3, tu: Plumb through support for per-shader robustness")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32052>
This is mostly a developer preference issue. Some Android
devs like to commit auto-generated files for ${reasons},
though the style of Mesa is not to do so.
I personally like the Mesa style, since otherwise a 25 million
LoC project would be 40 million, but whatever.
An easy solution is just to check them in to AOSP Mesa, but not
in upstream. There are various mechanisms, particularly
auto-rollers, that enable this. For example, there is no plan to
check in Blueprint files upstream, but they will be checked-in
and committed by the auto-roller.
For the scheme to work, we'll need slightly different meson
rules when the build target is Android versus otherwise.
Reviewed-by: Marcin Radomski <dextero@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32148>
When the batched descriptorset update optimization is turned
off, the descriptorset is not handled in the snapshot.
This CL handles this situation.
Reviewed-by: Marcin Radomski <dextero@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32148>
A dispatchable handle, such as commandBuffer, is always
left boxed by the decoder; consequently, the snapshotter should not
box it again.
Reviewed-by: Marcin Radomski <dextero@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32148>
Removed a function that creates anonymous file descriptors when called.
Additionally, replaced a call to said function with the one from the "util"
directory. The intention is to avoid duplicated functionality.
util: Allow code to be compatible with C++ compilers
Added an extern "C" statement and preprocessor directives to make the
"os_create_anonymous_file" function compatible with C++ compilers.
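The usual shape of such a guard is sketched below. The function `example_anon_file_stub` and its signature are hypothetical stand-ins chosen for illustration; the real declaration lives in the util headers and is not reproduced here.

```c
#include <stdint.h>

/* When compiled as C++, give the declarations C linkage so that the
 * symbol name matches the one produced by the C implementation. */
#ifdef __cplusplus
extern "C" {
#endif

/* Hypothetical declaration standing in for a util helper. */
int example_anon_file_stub(int64_t size, const char *debug_name);

#ifdef __cplusplus
}
#endif

/* Trivial definition so the sketch compiles on its own. It creates no
 * file and always returns -1. */
int example_anon_file_stub(int64_t size, const char *debug_name)
{
    (void)size;
    (void)debug_name;
    return -1;
}
```

C compilers never see the `extern "C"` lines because `__cplusplus` is only defined by C++ compilers, so the header stays valid in both languages.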
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32096>
Prevents the next patch from failing CTS tests such as:
dEQP-VK.api.image_clearing.core.clear_color_image.*.b4g4r4a4*
Brings back the feature that was introduced in commit 46187bb54f
("anv: Swizzle fast-clear values"), but went unused in commit
721d0c3e77 ("anv,hasvk: Always use BLORP_BATCH_NO_UPDATE_CLEAR_COLOR").
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32187>
We're going to drop a generic restriction on clear color conversions in
anv_can_fast_clear_color(). Without preparing for it, the following
tests would fail:
* piglit.spec.arb_framebuffer_srgb.blit texture srgb msaa disabled clear.gen9_zinkm64
* piglit.spec.arb_framebuffer_srgb.blit renderbuffer srgb msaa disabled clear.gen9_zinkm64
* piglit.spec.arb_framebuffer_srgb.blit texture srgb downsample enabled clear.gen9_zinkm64
* piglit.spec.arb_framebuffer_srgb.blit renderbuffer srgb downsample enabled clear.gen9_zinkm64
* piglit.spec.arb_framebuffer_srgb.blit renderbuffer srgb msaa enabled clear.gen9_zinkm64
* piglit.spec.arb_framebuffer_srgb.blit texture srgb msaa enabled clear.gen9_zinkm64
So, add support for sRGB sampling via BLORP transfer operations and drop
the gfx9-specific restriction on sRGB fast-clears.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32187>
We have to honor drivers when they say that different interpolation
qualifiers can't be mixed in the same vec4, indicated by
nir_io_has_flexible_input_interpolation_except_flat not being set.
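As a rough sketch of the check, assuming a hypothetical three-value qualifier enum and reading the option name as "any mix is allowed except with flat" (that reading is an assumption, as are all names below):

```c
#include <stdbool.h>

/* Hypothetical qualifier set; not the NIR interpolation enum. */
enum interp_sketch {
    INTERP_FLAT,
    INTERP_SMOOTH,
    INTERP_NOPERSPECTIVE,
};

/* Decides whether two inputs may be packed into the same vec4.
 * 'flexible_except_flat' models the driver option being set. */
static bool can_share_vec4(enum interp_sketch a, enum interp_sketch b,
                           bool flexible_except_flat)
{
    if (a == b)
        return true; /* identical qualifiers can always share a slot */

    if (!flexible_except_flat)
        return false; /* the driver cannot mix qualifiers in one vec4 */

    /* Mixing is allowed as long as neither side is flat. */
    return a != INTERP_FLAT && b != INTERP_FLAT;
}
```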
This is a prerequisite for enabling nir_opt_varyings for all drivers.
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32174>
This was enabled by default in nir_opt_varyings, but vc4 can't handle
shader outputs that write Y but not X. Add an option for it and enable
it only for the driver that benefits from it.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32174>
The main difference between load_global and load_global_constant is that
the latter can be reordered arbitrarily. If the access being lowered is
already tagged as being reorderable, then we can preserve that by using
the load_global_constant intrinsics instead of load_global. This gives
us more flexibility.
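The selection can be sketched as below, assuming a hypothetical access bit and opcode enum (`ACCESS_CAN_REORDER_SKETCH` and `load_op_sketch` are illustrative names, not the NIR definitions):

```c
/* Hypothetical stand-ins for the access flag and the two intrinsics. */
#define ACCESS_CAN_REORDER_SKETCH (1u << 0)

enum load_op_sketch {
    LOAD_GLOBAL,
    LOAD_GLOBAL_CONSTANT,
};

/* Picks the load to emit when lowering: a load already known to be
 * reorderable keeps that property by becoming a constant load. */
static enum load_op_sketch pick_global_load(unsigned access)
{
    return (access & ACCESS_CAN_REORDER_SKETCH) ? LOAD_GLOBAL_CONSTANT
                                                : LOAD_GLOBAL;
}
```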
On Intel, this lets us use the load_global_constant_uniform_block_intel
intrinsic for doing convergent block loads in more cases. This nets us
significant reductions in spill/fills: Borderlands 3 on Lunarlake sees
spills/fills reduced by 53%. Alchemist sees a 13% reduction.
Improves performance of Borderlands 3 DX12 on Intel Battlemage by
around 44%. Improves Hogwarts Legacy by around 14%.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31995>
We were accidentally doing a signed integer comparison here for ult32,
or a sign-extending shift for ushr.
One notable bit of fallout was that load_global_uniform_block_intel
address calculations broke on platforms that don't have native 64-bit
integer support, as the iadd64 lowering for "do I need to carry?" was
using ult32...and performing the wrong comparison. We spotted this in
Borderlands 3 on Alchemist once we turned on other optimizations.
Thanks to Lionel Landwerlin for helping spot the problem!
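To illustrate why the comparison must be unsigned, here is a standalone sketch of a 64-bit add built from 32-bit halves. The function name and the `use_signed_compare` switch are hypothetical and exist only to show the failure; this is not the compiler's lowering code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Adds two 64-bit values given as 32-bit halves. The carry out of the
 * low half is "low sum < low operand", which is only correct as an
 * unsigned comparison (ult32). Setting use_signed_compare reproduces
 * the bug. The cast to int32_t assumes two's complement conversion. */
static uint64_t add64_from_halves(uint32_t a_lo, uint32_t a_hi,
                                  uint32_t b_lo, uint32_t b_hi,
                                  bool use_signed_compare)
{
    uint32_t lo = a_lo + b_lo; /* wraps modulo 2^32 */
    uint32_t carry;

    if (use_signed_compare)
        carry = (int32_t)lo < (int32_t)a_lo; /* wrong: signed compare */
    else
        carry = lo < a_lo;                   /* right: unsigned compare */

    uint32_t hi = a_hi + b_hi + carry;
    return ((uint64_t)hi << 32) | lo;
}
```

With `a = 0x7fffffff` and `b = 1`, the low sum is `0x80000000`, which a signed comparison treats as negative and therefore "less than" `a_lo`, producing a carry that should not exist.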
Fixes: c7b312ad45 ("brw: factor out source extraction for rematerialization")
Fixes: 339630ab05 ("brw: enable A64 loads source rematerialization")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31995>
This was triggering an assertion in the fs_builder::MOV helper that
the destination stride can't be 0 when dispatch_width > 1. What we
want to do is copy the single 64-bit channel of data from the UNIFORM
file to a VGRF. We can use a SIMD1 builder for that.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31995>