The base mask previously used was 0xffffffff. This is not correct (but
should still work) for 16-bit and 8-bit values, but it means the high
32-bits of 64-bit values will get chopped off.
Instead of just restricting the pattern to 32-bits (as was done before
00b28a50b2), this extends the optimization in two ways:
1. Make it correct for other bit sizes.
2. Make it work for arbitrary shift counts.
This has the added benefit of reducing the number of patterns actually
added (7 previously, 4 now).
The "Reassociate for improved CSE" part is just reverted to its
pre-00b28a50b2c behavior. I doubt that pattern is likely to have much
impact outside 32-bits.
This change fixes the piglit tests
tests/spec/arb_gpu_shader_int64/fs-shl-of-shr-int64.shader_test and
tests/spec/arb_gpu_shader_int64/fs-iand-of-iadd-int64.shader_test.
All of the shaders helped in shader-db are vertex shaders on platforms
with vector-oriented vertex processing. The shaders contain ((x >> 16)
<< 16). These platforms set lower_extract_word, so the optimization
that transforms (x >> 16) to extract_u16 doesn't trigger. With only ~60
shaders involved, I didn't bother trying to add extract_XYZ versions of
these patterns to try to get those cases.
Fixes: 00b28a50b2 ("nir/algebraic: trivially enable existing 32-bit patterns for all bit sizes")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Haswell and earlier Intel GPUs had simlar results. (Haswell shown)
total instructions in shared programs: 16397554 -> 16397496 (<.01%)
instructions in affected programs: 7961 -> 7903 (-0.73%)
helped: 58
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.36% max: 1.89% x̄: 0.99% x̃: 0.78%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -1.13% -0.85%
Instructions are helped.
total cycles in shared programs: 1035483770 -> 1035483504 (<.01%)
cycles in affected programs: 75922 -> 75656 (-0.35%)
helped: 44
HURT: 2
helped stats (abs) min: 2 max: 12 x̄: 6.14 x̃: 2
helped stats (rel) min: 0.05% max: 1.67% x̄: 0.87% x̃: 0.72%
HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel) min: 0.06% max: 0.06% x̄: 0.06% x̃: 0.06%
95% mean confidence interval for cycles value: -7.28 -4.29
95% mean confidence interval for cycles %-change: -1.03% -0.63%
Cycles are helped.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8852>
(cherry picked from commit 6b0443a900)
The hardware can only scan-out linear buffers with a pitch
aligned to 256. It can only use packed buffers for cursors.
Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8500>
(cherry picked from commit a4c11385b7)
As it turns out, the wait is required as the driver expects for
rendering to be quiesced on exit. This can trigger channel failures,
which in turn trigger recovery. This can fail and destroy the whole
system.
Fixes: 28a781323f ("nouveau: change fence destruction logic on screen destroy")
References: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4223
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8867>
(cherry picked from commit 92f12952f3)
this must be reset to avoid issues when using VK_DESCRIPTOR_BINDING_PARTIALLY_BOUND_BIT_EXT
when some descriptors in the set may not have been bound
fixes#4219
Fixes: 126d5adb11 ("radv: Use host memory pool for non-freeable descriptors.")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8840>
(cherry picked from commit 09ce403b2d)
failing to unset any existing pointers here leads to stale bo entries in
the list and then the kernel rejecting the cmdbuf with ENOENT
Fixes: 126d5adb11 ("radv: Use host memory pool for non-freeable descriptors.")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8840>
(cherry picked from commit 2f534c2e2e)
The polygon list is written by tiler jobs and read by fragment ones,
and nothing should re-use the heap until the fragment job is done.
4fec6c9448 ("panfrost: Add the tiler heap to fragment jobs") fixed
this for the !multi-context case by adding the heap BO to fragment job.
But the tiler heap is shared accross contexts, and vertex/tiler+fragment
job submission is done through 2 separate ioctls, meaning that
vertex/tiler and fragment jobs from 2 different context might be
interleaved.
Add a lock at the device level to ensure tiler/vertex+fragment jobs are
submitted sequentially, with no other jobs using the same tiler heap
in-between.
Cc: mesa-stable
Fixes: d8deb1eb6a ("panfrost: Share tiler_heap across batches/contexts")
Reported-by: Icecream95 <ixn@disroot.org>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Icecream95 <ixn@disroot.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8822>
(cherry picked from commit 66125c429f)
If LIBGL_ALWAYS_SOFTWARE is set, then drisw is selected, and internally,
drisw should choose one of the actual software drivers. If it's not set,
but drisw is still selected (no hardware DRM driver, like in WSL), then
layered drivers are preferred over pure software.
Fixes: 4a3b42a7 ("drisw: Prefer hardware-layered sw-winsys drivers over pure sw")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4171
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8782>
(cherry picked from commit a88cd98315)
This goes down the list and picks the first non-cpu device, when
we merge the CI patch we should add a forcing env var in here.
Fixes: 8d46e35d1 ("zink: introduce opengl over vulkan")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8845>
(cherry picked from commit e41b0202c9)
Previously, we would set WGP_MODE on GFX10+ and then only on GFX10.
Because we used bitwise or, the result was WGP_MODE being set on GFX10+.
We also set the wrong bit, S_00B848_WGP_MODE instead of S_00B228_WGP_MODE.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8811>
(cherry picked from commit 2338e4ad36)
Vertex attribute bounds checking is supposed to be done per-attribute:
is_oob = index * stride + attrib_offset + attrib_size > buffer_size
but we were obtaining num_records by dividing the buffer size by the
stride, making it per-vertex:
is_oob = index * stride + (stride - 1) >= buffer_size
An example from Dead Cells (Wine) is:
attribute bindings: 0, 1, 2
attribute formats: r32g32, r32g32, r32g32b32a32
attribute offsets: 0, 0, 0
binding buffers: all the same buffer
binding offsets: 0, 8, 16
binding sizes: 128, 120, 112
binding strides: 32, 32, 32
Workaround this issue without switching to per-attribute descriptors by
rounding up the division. This is still incorrect, but it should now no
longer consider in-bounds attributes out-of-bounds.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3796
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4199
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8835>
(cherry picked from commit 56cd79b63d)
2f1947b39c ("panfrost: Fix tiler job injection") had the tests
inverted: WRITE_VALUE jobs are only needed on Midgard, not Bifrost.
Cc: mesa-stable
Fixes: 2f1947b39c ("panfrost: Fix tiler job injection")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8808>
(cherry picked from commit ec6c6f610c)
This is needed to use the new dispatch layer code. While we're here, we
clean up the context on the error path.
Fixes: 9b1138e3f0 "radv: implement VK_EXT_private_data"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8676>
(cherry picked from commit f695957421)
There was only one caller, an error path in mesa/st. But this is now
incorrect as we need align_free(). Just remove it.
Fixes: 55e853d823 ("mesa/st: Allocate the gl_context with 16-byte alignment.")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8801>
(cherry picked from commit 824ae64477)
The _ModelProjectMatrix matrix embedded inside has members inside of it
marked as 16-byte aligned, and so the context also has to be 16-byte
aligned or access to those members would be invalid. I believe the
compiler used this to use better 16-byte-aligned load/stores to other
members of the context, breaking when the context's alignment was only 8
(as normal mallocs guarantee).
Fixes: 3175b63a0d ("mesa: don't allocate matrices with malloc")
Tested-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8783>
(cherry picked from commit 55e853d823)
C's definition of the % operator has a footgun around sign conversion.
Avoid it and just use bitwise arithemtic instead like the hardware
would, fixing the disassembly and making buggy assembly more obvious.
Fixes: 08a9e5e3e8 ("pan/bi: Decode M values in disasm")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8723>
(cherry picked from commit a69c73988b)
Single-line version of MSVC warning suppression does not extend beyond
the #endif directive. Use push/disable/pop instead.
Also suppress 26452, which is a similar analysis warning.
This could also be fixed with constexpr if, but C++17 would be required.
Fixes: 790516db0b ("gallium/swr: fix gcc warnings")
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8093>
(cherry picked from commit 3c7062417b)
It was already parsed in intelInitScree2, and the results are stored in
the screen.
Fixes: d67ef48580 ("i965/screen: Allow drirc to set 'allow_rgb10_configs' again.")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7387>
(cherry picked from commit 0f1a8f8a6d)
This has no effect on other shading. It should have been the default value.
Fixes: c3432ad852 - radeonsi: add an option to enable 2x2 coarse shading for non-GUI elements
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8726>
(cherry picked from commit dbf09c0c26)
Disable compression in iris_flush_resource if the resource lacks a
modifier. When a caller wants to prepare such a resource for sharing
(via eglCreateImage for example), this change enables all reference
holders to access the resource in a common manner - without compression.
This fixes misrendering with 3D-accelerated qemu. A piglit test which
reproduces qemu's behavior, ext_image_dma_buf_import-export-tex, is also
enabled to pass.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2678
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8663>
(cherry picked from commit 40d6b92de9)
Some drivers need to be able to remove compression from resources before
they are handed to consumers that wouldn't understand or expect it.
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8663>
(cherry picked from commit b26f510978)
Many entries in the dri2_format_table have _DRI_IMAGE_FORMAT_NONE as the
dri_format. Make the result of dri2_get_mapping_by_format a tad more
well-defined by returning NULL when this format is passed into it.
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8663>
(cherry picked from commit 0a8cc88202)
It is currently a bitset on top of a uint64_t but there are already
more than 64 values. Change to use BITSET to cover all the
SYSTEM_VALUE_MAX bits.
Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8585>
(cherry picked from commit 9f3d5e99ea)
On Fiji, the CTS image can cause a hang when these are enabled.
Let's enable them for Polaris and newer only, for now.
Gitlab: #4136
Fixes: 9f43b44bf0
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8646>
(cherry picked from commit 3c03fa5801)
It could happen that due to inconsistent copy-propagation
v1 = p_parallelcopy v2b
instructions were left after optimization on GFX8.
Cc: 20.3
Cc: 21.0
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8260>
(cherry picked from commit 856fd4750d)