For formats with 32-bit channels (R32F, R32I, R32UI, etc.), the
expression (1 << channel_size) - 1 is undefined behavior in C when
channel_size is 32, because the shift count equals the width of the
32-bit int being shifted.
On most platforms this produces mask=0, resulting in BLT.CLEAR_BITS=0x0
which tells the hardware to write no pixel data during clear operations.
Use 1ull to perform the shift in 64-bit, correctly producing 0xFFFFFFFF
for 32-bit channels.
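As a minimal sketch of the fix (channel_mask is a hypothetical helper name, not the driver's actual code), the shift is done on a 64-bit operand and the result is truncated back to 32 bits:

```c
#include <stdint.h>

/* Hypothetical helper: build a mask with the low channel_size bits set.
 * Assumes 1 <= channel_size <= 32. */
static inline uint32_t
channel_mask(unsigned channel_size)
{
   /* 1ull makes the shift happen in 64 bits, so channel_size == 32 is
    * well defined; the cast then truncates the result to 32 bits. */
   return (uint32_t)((1ull << channel_size) - 1);
}
```

With this, channel_mask(32) yields 0xFFFFFFFF, while smaller channel sizes behave as before.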
Fixes dEQP-GLES3.functional.fbo.blit.default_framebuffer.r32f_nearest_out_of_bounds_blit_from_default
Fixes: c156da579c ("etnaviv: blt: Enable masked clear for color and stencil")
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Daniel Lang <dalang@gmx.at>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39949>
fossil-db (navi31):
Totals from 2 (0.00% of 84369) affected shaders:
Instrs: 7738 -> 7740 (+0.03%)
Latency: 333207 -> 333239 (+0.01%)
InvThroughput: 33320 -> 33324 (+0.01%)
VClause: 382 -> 384 (+0.52%)
VMEM: 656 -> 658 (+0.30%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 4ca7ee7bd7 ("nir/opt_load_store_vectorize: Allow to vectorize at most one entry of each type across blocks")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14825
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39849>
This prevents both loads 'b' and 'c' from being vectorized:
   a = load(0)
   loop {
      b = load(0)
      if (break)
         store(0)
   }
   c = load(0)
fossil-db (navi31):
Totals from 8 (0.01% of 84369) affected shaders:
Instrs: 12035 -> 12066 (+0.26%)
CodeSize: 63016 -> 63208 (+0.30%)
Latency: 176091 -> 177013 (+0.52%)
InvThroughput: 43894 -> 43981 (+0.20%)
SClause: 194 -> 196 (+1.03%)
Copies: 803 -> 812 (+1.12%)
VALU: 7666 -> 7675 (+0.12%)
SALU: 1102 -> 1105 (+0.27%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 4ca7ee7bd7 ("nir/opt_load_store_vectorize: Allow to vectorize at most one entry of each type across blocks")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14825
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39849>
Move MSAA sample count validation into a gpu_supports_msaa() helper
and call it early in etna_screen_is_format_supported(). Previously,
the MSAA checks were only done for render targets inside
gpu_supports_render_format(), so depth/stencil formats with
unsupported sample counts were incorrectly reported as supported.
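The reordering can be sketched roughly as follows. All names and the sample_mask field are hypothetical stand-ins, not the etnaviv data structures; the point is only that the sample-count check runs before any per-usage checks, so it covers depth/stencil as well as render targets:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the screen: bit n set means that n samples
 * are supported (an assumption made for this sketch). */
struct toy_screen {
   uint32_t sample_mask;
};

static bool
toy_gpu_supports_msaa(const struct toy_screen *screen, unsigned sample_count)
{
   /* Single-sampled surfaces never need MSAA support. */
   if (sample_count <= 1)
      return true;

   return sample_count < 32 && ((screen->sample_mask >> sample_count) & 1u);
}

static bool
toy_is_format_supported(const struct toy_screen *screen,
                        bool is_depth_stencil, unsigned sample_count)
{
   /* Checked first, independent of how the format will be used. */
   if (!toy_gpu_supports_msaa(screen, sample_count))
      return false;

   /* Per-usage checks (render target, depth/stencil, ...) would follow. */
   (void)is_depth_stencil;
   return true;
}
```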
Fixes dEQP-GLES3.functional.fbo.msaa.2_samples.depth24_stencil8
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Daniel Lang <dalang@gmx.at>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39946>
The current approach of explicitly saving/restoring some states is
unnecessarily complicated and inefficient. For example, some meta OPs
that use memory fills/copies will have nested save/restores. This patch
is the first step towards avoiding unnecessary state re-emits around
meta OPs.
The changes are:
- Move radv_meta_saved_state to radv_cmd_buffer::state
- Add radv_meta_begin/end helpers that initialize radv_meta_saved_state
and restore states used by the meta OP
- Remove all explicit saves/restores, use the new helpers
radv_meta_begin/end are called inside the entrypoint rather than in a
nested helper function, which means that state is restored only once
per meta OP.
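A toy model of the pattern, with every name hypothetical and a single integer standing in for the saved state, might look like this:

```c
/* Hypothetical miniature of a command buffer; none of these names are
 * radv's. The saved state lives inside the command buffer itself. */
struct toy_cmd_buffer {
   int bound_pipeline;      /* state owned by the application */
   int saved_pipeline;      /* copy taken when a meta OP begins */
   unsigned restore_count;  /* number of restores, for illustration */
};

static void
toy_meta_begin(struct toy_cmd_buffer *cmd)
{
   cmd->saved_pipeline = cmd->bound_pipeline;
}

static void
toy_meta_end(struct toy_cmd_buffer *cmd)
{
   cmd->bound_pipeline = cmd->saved_pipeline;
   cmd->restore_count++;
}

/* Nested helper: binds its own pipeline but does no save/restore. */
static void
toy_fill_memory(struct toy_cmd_buffer *cmd)
{
   cmd->bound_pipeline = -1;
}

/* Entrypoint: the only place where begin/end are called. */
static void
toy_cmd_clear(struct toy_cmd_buffer *cmd)
{
   toy_meta_begin(cmd);
   cmd->bound_pipeline = -2;
   toy_fill_memory(cmd);
   toy_meta_end(cmd);
}
```

Calling toy_cmd_clear() once leaves bound_pipeline at its original value and restore_count at 1, even though the entrypoint used a nested helper.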
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39774>
The issue that caused us to add a switch to disable (Xe2) DRM modifiers
in 2418c91537 is fixed in GTK 4.20.3,
so we can enable the modifiers with this and newer GTK releases.
GTK https://gitlab.gnome.org/GNOME/gtk/-/merge_requests/9164:
b2a42d5a6e Revert "vulkan: Wait for device to be idle before
create/recreating swapchain"
270735a151 vulkan: Rework swapchain present implementation
The hex values encode the GTK version range [4.0.0, 4.20.2] in
VK_MAKE_VERSION() form, refer to:
f493f5c88d
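For reference, the encoding can be reproduced with a local copy of the VK_MAKE_VERSION() bit layout (major << 22 | minor << 12 | patch). TOY_MAKE_VERSION and toy_version_in_range are hypothetical names used only for this sketch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copy of the VK_MAKE_VERSION() layout, so that the sketch does
 * not depend on the Vulkan headers. */
#define TOY_MAKE_VERSION(major, minor, patch) \
   ((((uint32_t)(major)) << 22) | (((uint32_t)(minor)) << 12) | \
    ((uint32_t)(patch)))

/* Inclusive range check on encoded versions. */
static bool
toy_version_in_range(uint32_t version, uint32_t lo, uint32_t hi)
{
   return version >= lo && version <= hi;
}
```

Under this layout 4.0.0 encodes to 0x01000000 and 4.20.2 to 0x01014002, so 4.20.3 falls just outside the range.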
Cc: mesa-stable
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39223>
Seen with TU_DEBUG=gmem,3d_load in at least "Industria" and "NieR Replicant".
TU_DEBUG=noconcurrentresolves also prevents the issue.
We have to wait until all CP_EVENT_WRITE::BLIT are completed,
otherwise writing to the depth image as color confuses the HW.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39236>
VK_FORMAT_{R8G8B8,B8G8R8}_{UNORM,SRGB} describe a 3-component, 8bpc,
24bpp format. This is mapped to that type for Android, and implemented
as such by panvk. radv maps these to 4-component/32bpp formats, but only
supports these formats for buffers rather than images. The outlier is
ANV, which relies on the 24->32bpp mapping to happen.
The Wayland WSI was mapping this to the 32bpp R8G8B8A8/B8G8R8A8 formats
instead. This would cause a failure to import the dmabuf into the
compositor on panvk, as it would send a buffer which was too small. (Or,
if it did import: garbage.)
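To illustrate the size mismatch, here is a rough calculation that assumes tightly packed rows with no stride alignment (real allocations will usually differ); toy_packed_image_size is a hypothetical helper:

```c
#include <stddef.h>
#include <stdint.h>

/* Size in bytes of a tightly packed image. Ignores stride alignment and
 * any driver-specific padding, which is an assumption of this sketch. */
static size_t
toy_packed_image_size(uint32_t width, uint32_t height,
                      uint32_t bytes_per_pixel)
{
   return (size_t)width * height * bytes_per_pixel;
}
```

For a 64x64 image this gives 12288 bytes at 24bpp but 16384 bytes at 32bpp, so a buffer allocated for the former is too small when described as the latter.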
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39552>
Adjust or fill out various properties for the a830 GPU, setting up a gen1
base. So far these mostly mirror the gen2 properties, except for gmem
config layouts, and they will probably further diverge down the line.
A new GPU ID for a830 is also added; Turnip there runs on top of KGSL.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39874>
With a8xx a lot of chicken bit and other device-specific magic register
handling has moved into the kernel, which leaves a list of register writes
that could be more commonly shared between all a8xx devices.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39874>
This just bit me. Add an assert to catch the next person who doesn't
read the function signature, tries to extract 64 bits, and wonders
why things are silently broken.
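The kind of guard meant here could look like the following. toy_extract_bits is a hypothetical function, not the one touched by this commit; it only shows an assert enforcing that the requested width fits the 32-bit return type:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical extractor: returns `count` bits of `value` starting at
 * bit `start`. The return type is 32 bits wide, so wider requests would
 * be silently truncated without the assert. */
static inline uint32_t
toy_extract_bits(uint64_t value, unsigned start, unsigned count)
{
   assert(count >= 1 && count <= 32 && "result is only 32 bits wide");
   assert(start + count <= 64);

   /* count <= 32, so this 64-bit shift is always well defined. */
   uint64_t mask = (1ull << count) - 1;
   return (uint32_t)((value >> start) & mask);
}
```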
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39892>
Match other SSBO intrinsics and other atomics.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39895>
dzn_physical_device_create runs dzn_wsi_init, which allocates and
initializes wsi_interfaces. If the WSI initialization fails, the
wsi_interfaces are cleaned up and freed (wsi_common.c:330 ->
wsi_device_finish). Once the failure propagates up,
dzn_physical_device_create runs dzn_physical_device_destroy, and
dzn_wsi_finish then frees the wsi_interfaces again.
This path led to a segmentation fault on my system when running:
dEQP-VK.api.device_init.create_instance_device_intentional_alloc_fail.basic
This change removes the dzn_wsi_finish call if dzn_wsi_init fails, avoiding
a double free.
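A reduced model of the ownership rule is sketched below. Every name is hypothetical, and the wsi_ready flag is an assumption of the sketch rather than a description of how the driver tracks this; the idea is only that the finish path runs when init succeeded and is skipped otherwise:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical miniature of the physical device. */
struct toy_device {
   int *wsi_interfaces;
   bool wsi_ready;       /* set only when init fully succeeded */
   unsigned free_calls;  /* counts frees, for illustration */
};

/* On failure, init cleans up after itself. */
static bool
toy_wsi_init(struct toy_device *dev, bool inject_failure)
{
   dev->wsi_interfaces = malloc(sizeof(*dev->wsi_interfaces));
   if (!dev->wsi_interfaces)
      return false;

   if (inject_failure) {
      free(dev->wsi_interfaces);
      dev->wsi_interfaces = NULL;
      dev->free_calls++;
      return false;
   }

   dev->wsi_ready = true;
   return true;
}

static void
toy_wsi_finish(struct toy_device *dev)
{
   free(dev->wsi_interfaces);
   dev->wsi_interfaces = NULL;
   dev->wsi_ready = false;
   dev->free_calls++;
}

static void
toy_device_destroy(struct toy_device *dev)
{
   /* Skip the finish path if init never succeeded. */
   if (dev->wsi_ready)
      toy_wsi_finish(dev);
}

static bool
toy_device_create(struct toy_device *dev, bool inject_failure)
{
   if (!toy_wsi_init(dev, inject_failure)) {
      toy_device_destroy(dev);
      return false;
   }
   return true;
}
```

With an injected failure, the allocation is freed exactly once, by the init path itself.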
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39928>
ac_prepare_cs_clear_copy_buffer determines whether to use CP DMA, and
the driver obeys that.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39841>
This should make copying sparse faster if we get aligned buffer bounds.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39841>
This commit extracts the third and final variant of function
anv_get_image_format_features2(). It is still a 296-line function, but
that is already significantly smaller than the 444-line behemoth that
anv_get_image_format_features2() was at the start of this patch
series.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39840>
Function anv_get_image_format_features2() has 3 clear subvariants that
take paths independent of each other: one for compressed_emulated
formats, another for depth/stencil formats, and a third one for color
formats. Extract the first two subvariants to their own sub-functions.
We'll extract the color variant in the next commit in order to make
the diff easier to review.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39840>
This is a 76-line chunk of code just to decide if the format is
supported; let's move it to its own function.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39840>
It's redundant information, as it's already part of struct anv_format.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39840>