This breaks CLOn12's handling of CL CTS test_basic vector_creation for char3 (at least).
Removing this cast causes us to try to load from a deref with no alignment info.
Fixes: 99bb2a4d ("nir/opt_deref: Don't remove casts with alignment information")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10165>
The current handling for SPIR-V memory semantics is very specific to
the wording in the SPIR-V spec, which breaks its handling of OpenCL
(compared to what we had working downstream before merging upstream).
Update/relax the logic here to support CL's barrier(CLK_GLOBAL_MEM_FENCE).
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10165>
MSAA 4x and 8x should only clear the first 2 samples because other samples
are uncompressed. The compute shader only clears that subset of DCC.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
The retile map is removed and replaced by direct DCC address computations
in the retile shader using the new function ac_nir_dcc_addr_from_coord.
The RADV code is disabled.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
The test takes over 2 minutes on a 12C/24T CPU with OpenMP.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
This fixes an addrlib failure on gfx9.
Fixes: b43f40166c "ac/surface: select best swizzle mode for 3D sampler performance"
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
This adds a clear_buffer compute shader that does read-modify-write to
update a subset of bits in HTILE.
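
As a rough illustration of the read-modify-write idea, here is a CPU-side sketch in C. The function name, parameters, and the dword-at-a-time loop are assumptions made for illustration; the actual change is a compute shader and its interface is not described here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU model of a masked buffer clear: only the bits selected
 * by writebitmask are replaced; all other bits keep their old value. */
static void
clear_buffer_rmw(uint32_t *dwords, size_t num_dwords,
                 uint32_t clear_value, uint32_t writebitmask)
{
   /* The clear value must not carry bits outside the mask. */
   assert((clear_value & ~writebitmask) == 0);

   for (size_t i = 0; i < num_dwords; i++) {
      uint32_t old = dwords[i];                              /* read   */
      uint32_t merged = (old & ~writebitmask) | clear_value; /* modify */
      dwords[i] = merged;                                    /* write  */
   }
}
```

A plain clear would overwrite whole dwords; the masked form is what allows one field to be updated while the neighbouring bits are left alone.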
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
Set the final value in si_texture_create_object, so that other places
don't have to derive it redundantly.
The only thing to remember is that HTILE stencil can be enabled when
stencil is not present, and it can be disabled when stencil is present
due to various workarounds.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
Fast clears are only used for level 0. This enables clearing level 0
of CMASK and DCC on gfx10+ when there are multiple mipmap levels.
vi_dcc_clear_level can also clear any level now.
Mipmapped array textures are still cleared slowly.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
FMASK is usually pretty large. It's better to leave the cache to shaders.
FMASK stores are still cached, but they can be evicted sooner, which is
the same as other color stores. Only DCC, HTILE, and CMASK are cached.
I haven't benchmarked this, but it seems like the right thing to do.
This only affected APUs.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
With a resolution of 1600x1200, I measured FPS increases in:
* glxgears 18.04% +/- 0.65% (n=691)
* Nexuiz 3.58% +/- 0.09% (n=553)
compared to the master branch at commit
3f614c6f7c.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9230>
Pull in the header from drm-next commit
32c3d9b0f51ee1e6bb0160496b97e50b5caca4d0. Among other things, this
brings in the I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS_CC modifier.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9230>
bo_free is called on external BOs when there are no objects left which
reference them. The function unmaps the address range associated with
any maps which occurred. However, if the BO is busy (not idle), it
doesn't mark the pointer to the start address as invalid. This can lead
to a segfault later on.
At the end of bo_free, these BOs are still present in the handle hash
table. If such a BO is reused (i.e., when a DMABUF with the same handle
is reimported) and the driver attempts to get another mapping, the
bufmgr will incorrectly assume that the map pointer is still valid and
reuse it. This leads to a segfault. Set the pointer to NULL to mark it
as invalid.
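
The failure mode can be sketched with a toy model in C. Everything below (struct toy_bo, the toy_* helpers, and the use of calloc/free in place of mmap/munmap) is hypothetical and only mirrors the logic described above; it is not the iris bufmgr code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a buffer object. */
struct toy_bo {
   void *map;    /* CPU mapping, or NULL when not mapped */
   size_t size;
   bool busy;    /* still in use by the GPU */
};

/* Stand-ins for mmap/munmap so the sketch runs anywhere. */
static void *toy_mmap(size_t size) { return calloc(1, size); }
static void toy_munmap(void *ptr) { free(ptr); }

static void *
toy_bo_map(struct toy_bo *bo)
{
   assert(bo->size > 0);
   /* An existing mapping is reused, so a stale pointer here would be
    * handed straight back to the caller. */
   if (!bo->map)
      bo->map = toy_mmap(bo->size);
   return bo->map;
}

static void
toy_bo_free(struct toy_bo *bo)
{
   if (bo->map) {
      toy_munmap(bo->map);
      bo->map = NULL; /* the fix: invalidate even when bo->busy is set */
   }
   /* A busy BO stays in the handle table and may be looked up again. */
}
```

Without the `bo->map = NULL` line, a second toy_bo_map() call on a reimported BO would return the already-unmapped address.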
This enables iris to run and pass the piglit test
ext_image_dma_buf_import-reimport-bug.
Cc: mesa-stable
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9230>
This info is important to have for a given frame, but it requires that the base
structs be copied and stored to the trace context for later use.
Acked-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10093>
When crosvm does not support venus, it still advertises
VIRGL_RENDERER_CAPSET_VENUS but provides no or zeroed capset data.
vk_xml_version will be zero.
It is a good idea to verify vk_xml_version anyway.
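
A minimal sketch of such a check in C follows. The struct, the function name, and the minimum version constant are all hypothetical; the real required version is not stated in this message.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical minimum, chosen only for illustration. */
#define TOY_MIN_VK_XML_VERSION 170u

/* Hypothetical stand-in for the capset data returned by the host. */
struct toy_capset {
   uint32_t vk_xml_version; /* zero when the capset is missing or zeroed */
};

static bool
toy_capset_is_usable(const struct toy_capset *caps)
{
   if (caps->vk_xml_version < TOY_MIN_VK_XML_VERSION) {
      /* Print the required version so the mismatch is easy to diagnose. */
      fprintf(stderr, "vk xml version %u is too old; need at least %u\n",
              (unsigned)caps->vk_xml_version,
              (unsigned)TOY_MIN_VK_XML_VERSION);
      return false;
   }
   return true;
}
```

A zeroed capset fails this check in the same way as a host that is simply too old, so no separate "is venus really supported" probe is needed.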
v2: print required version suggested by Ryan
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10192>
Previously panfrost_batch_add_bo was called MAX_MIP_LEVELS times on
the same batch.
Fixes: cbf68b21fb ("panfrost: Move checksum_bo to panfrost_resource")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10158>
Because the VCN encoder needs the surface to be memory aligned, the
image passed to the encoder might be larger than the requested
resolution and carry extra padding. This change crops the resulting
output to compensate for any padding that was added.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4559
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10137>
Otherwise we may hit the tile heap size limit if an app issues too many
draws per job.
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10121>