This fixes a hang in shadertoy for radeonsi where a buffer was initialized with:
value -= value
with value being undefined.
In this case LLVM replaces the operation with an assignment to NaN.
Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111241
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 47cc660d9c)
See the previous commit for the explanation of the Fixes tag.
Hurts 21 shaders in shader-db. All of the hurt shaders are in Unreal
Engine 4 tech demos.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 7afa26d4e3 ("nir: Add lowering for nir_op_bitfield_reverse.")
(cherry picked from commit b418269d7d)
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
[Juan A. Suarez: resolve trivial conflicts]
Conflicts:
src/intel/compiler/brw_compiler.c
This caused a problem on Sandybridge where an open-coded
bitfieldReverse() function could be optimized to a
nir_op_bitfield_reverse that would generate an unsupported BFREV
instruction in the backend. This was encountered in some Unreal4 tech
demos in shader-db. The bug was not previously noticed because we don't
actually try to run those demos on Sandybridge.
The fixes tag is a bit of a lie. The actual bug was introduced about
26,000 commits earlier in 371c4b3c48 ("nir: Recognize open-coded
bitfield_reverse."). Without the NIR lowering pass, the flag needed to
avoid the optimization does not exist. Hopefully nobody will care to
fix this on an earlier Mesa release.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 7afa26d4e3 ("nir: Add lowering for nir_op_bitfield_reverse.")
(cherry picked from commit d3fd1c761a)
src0 vstride and type overlap with bits of the extended descriptor.
brw_set_desc() also sets the extended descriptor to 0. So by setting
the descriptor, then setting src0, we were accidentally setting a bunch
of extended descriptor bits unintentionally.
When using this infrastructure for framebuffer writes (in a future
patch), this ended up setting the extended descriptor bit 20, which is
"Null Render Target" on Icelake, causing nothing to be written to the
framebuffer.
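The ordering hazard can be sketched with a toy encoding. Everything here
is hypothetical (the toy_* names, the single 32-bit word, the bit
positions); it is not the real instruction layout, only an illustration
of two setters that share bits, where the later call wins:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy encoding: bits 0..15 hold the descriptor, and the
 * src0 fields share bits 16..23 with the extended descriptor. */
struct toy_inst { uint32_t dw; };

/* Writes vstride into bits 20..23 and type into bits 16..19. */
static void toy_set_src0(struct toy_inst *inst, uint32_t vstride, uint32_t type)
{
   inst->dw = (inst->dw & ~0x00ff0000u) |
              ((vstride & 0xfu) << 20) | ((type & 0xfu) << 16);
}

/* Writes the descriptor and, like the brw_set_desc() behaviour described
 * above, also zeroes the extended descriptor bits. */
static void toy_set_desc(struct toy_inst *inst, uint32_t desc)
{
   inst->dw = (inst->dw & ~0x00ffffffu) | (desc & 0xffffu);
}

/* Reads back the extended descriptor bits (16..23). */
static uint32_t toy_ex_desc(const struct toy_inst *inst)
{
   return (inst->dw >> 16) & 0xffu;
}
```

In this toy, calling toy_set_desc() and then toy_set_src0() leaves
non-zero extended descriptor bits behind, while the opposite order
leaves them cleared.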
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit c8c9c48684)
This fixes the following CTS test on 32-bit systems:
GTF-GL46.gtf30.GL3Tests.packed_depth_stencil.packed_depth_stencil_init
It does glGetTexImage of a 16-bit SNORM image, requesting 32-bit UNORM
data. In get_tex_rgba_uncompressed, we round trip through float to
handle image transfer ops for clamping. _mesa_format_convert does:
_mesa_float_to_unorm(0.571428597f, 32)
which translated to:
_mesa_lroundevenf(0.571428597f * 0xffffffffu)
which produced different results on 64-bit and 32-bit systems:
64-bit: result = 0x92492500
32-bit: result = 0x80000000
This is because the size of "long" varies between the two systems, and
0x92492500 is too large to fit in a signed 32-bit integer. To fix this,
we switch to the new _mesa_i64roundevenf function which always does the
64-bit operation.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104395
Fixes: 594fc0f859 ("mesa: Replace F_TO_I() with _mesa_lroundevenf().")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit e18cd5452a)
This always returns an int64_t, translating to _mesa_lroundevenf on
systems where long is 64-bit, and llrintf where "long long" is needed.
Fixes: 594fc0f859 ("mesa: Replace F_TO_I() with _mesa_lroundevenf().")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit b59914e179)
Looks like a copy/paste error. This patch prevents a segfault when
running the following on BDW:
INTEL_DEBUG=no8,no16,do32 ./deqp-vk -n \
dEQP-VK.subgroups.arithmetic.compute.subgroupmin_dvec4
For the curious, the message we're getting is:
CS compile failed: Failure to register allocate. Reduce number
of live scalar values to avoid this.
Fixes: 864737ce6c ("i965/fs: Build 32-wide compute shader when needed.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
(cherry picked from commit 848d5e444a)
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
[Juan A. Suarez: resolve trivial conflicts]
Conflicts:
src/intel/compiler/brw_fs.cpp
Whenever a buffer is allocated, e.g. by the first draw call or EGL call after a
buffer swap, make sure the size is up to date. Prior to this commit, we
failed to do so when querying the buffer age, or swapping buffers
without any prior EGL call or draw call.
Signed-off-by: Jonas Ådahl <jadahl@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 903ad59407)
Make sure we read the updated data from the GPU in cases where WAIT_BIT
is not set.
Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit a410823b3e)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Conflicts:
src/amd/vulkan/radv_query.c
...by copying the implementation of anv_get_absolute_timeout().
Appears to fix a CTS test with 32-bit builds:
GTF-GL46.gtf32.GL3Tests.sync.sync_functionality_clientwaitsync_flush
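For illustration, a sketch of the general shape of a relative-to-absolute
timeout helper. toy_absolute_timeout() and its parameters are
hypothetical; this is not the anv or iris code, and the assumption that
the deadline must stay within a signed 64-bit range is mine, not stated
above:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: turn a relative timeout in nanoseconds into an
 * absolute deadline, given the current time in nanoseconds. The result
 * is clamped to INT64_MAX so that "wait forever" style inputs such as
 * UINT64_MAX cannot wrap around. */
static uint64_t toy_absolute_timeout(uint64_t now_ns, uint64_t rel_timeout_ns)
{
   assert(now_ns <= (uint64_t)INT64_MAX);

   /* A zero timeout means "poll"; no deadline needs to be computed. */
   if (rel_timeout_ns == 0)
      return 0;

   /* Largest relative value that keeps the sum within int64_t range. */
   uint64_t max_rel = (uint64_t)INT64_MAX - now_ns;
   if (rel_timeout_ns > max_rel)
      rel_timeout_ns = max_rel;

   return now_ns + rel_timeout_ns;
}
```

All arithmetic is done in uint64_t, so the result does not depend on the
width of long on the build target.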
Fixes: f459c56be6 ("iris: Add fence support using drm_syncobj")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
(cherry picked from commit 7ee7b0ecbc)
Fixes errors seen with eglSetBlobCacheFuncsANDROID on Android when
running dEQP that terminates and reinitializes a display.
Fixes: 6f5b57093b "egl: add support for EGL_ANDROID_blob_cache"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 3e03a3fc53)
Fixes: The following commit depends on commits 77a1070d36 and
df4c2ec5e1 in order to compile, which did not land in the branch.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
This doesn't work for compressed formats, as the source texture and
temporary texture would have different block sizes. (Forcing the driver
to always take the GPU path would expose the bug.) Instead, just use
the source format for the temporary, and let blorp_copy deal with
overrides.
The one case where we can't do this is ASTC, because isl won't let us
create a linear ASTC surface. Fall back to the CPU paths there for now.
Fixes: 9d1334d2a0 ("iris: Use copy_region and staging resources to avoid transfer stalls")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
(cherry picked from commit 136629a1e3)
Fixes: This commit does not apply cleanly on 19.1 branch, as it depends
on other commits not present in the branch.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
For renderable surfaces, we allocate SURFACE_STATEs for each bit in
res->aux.possible_usages. Sampler views use res->aux.sampler_usages.
When pinning buffers, we call surf_state_offset_for_aux() to calculate
the offset to the desired surface state. surf_state_offset_for_aux()
took an aux_modes parameter, which should be one of those two fields.
However...it was not using that parameter. It always used the broader
res->aux.possible_usages field directly.
One of the callers, update_clear_value(), was passing incorrect masks
for this parameter. It iterated through the bits in order, using
u_bit_scan(), which destructively modifies the mask. So each time we
called it, the count of bits before our selected mode was 0, which would
cause us to always update the SURFACE_STATE for ISL_AUX_USAGE_NONE,
rather than updating each in turn. This was hidden by the earlier bug
where surf_state_offset_for_aux() ignored the parameter.
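The second bug can be reproduced in isolation. In this sketch the toy_*
functions are hypothetical stand-ins (toy_bit_scan() mimics a destructive
bit scan, toy_state_index() mimics counting the set bits below the
selected mode); it is not the iris code:

```c
#include <assert.h>

/* Returns the index of the lowest set bit and clears it in *mask. */
static int toy_bit_scan(unsigned *mask)
{
   assert(*mask != 0);
   int i = 0;
   while (!(*mask & (1u << i)))
      i++;
   *mask &= ~(1u << i);
   return i;
}

/* Index of `mode` among the set bits of `modes`: the number of set
 * bits below it. */
static unsigned toy_state_index(unsigned modes, int mode)
{
   unsigned below = modes & ((1u << mode) - 1);
   unsigned n = 0;
   for (; below; below &= below - 1)
      n++;
   return n;
}

/* Passes the mask that the scan is consuming: every bit below `mode`
 * has already been cleared, so the index is always 0. */
static unsigned toy_indices_consumed_mask(unsigned usages, unsigned *out)
{
   unsigned n = 0, scan = usages;
   while (scan) {
      int mode = toy_bit_scan(&scan);
      out[n++] = toy_state_index(scan, mode);
   }
   return n;
}

/* Passes the unmodified mask, so each mode gets its own index. */
static unsigned toy_indices_full_mask(unsigned usages, unsigned *out)
{
   unsigned n = 0, scan = usages;
   while (scan) {
      int mode = toy_bit_scan(&scan);
      out[n++] = toy_state_index(usages, mode);
   }
   return n;
}
```

For a mask with bits 0, 1 and 3 set, the first loop yields the index 0
three times, while the second yields 0, 1 and 2.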
Fixes: 7339660e80 ("iris: Add aux.sampler_usages.")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
(cherry picked from commit 117a0368b0)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Conflicts:
src/gallium/drivers/iris/iris_state.c
Fixes: This commit does not apply cleanly on 19.1 branch, as it depends
on other commits not present in the branch.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
The compute paths in vl are a bit AMD-specific. For example, on nouveau
they try to use a BGRX8 image format, which is not supported.
Fixing all this is probably possible, but since the compute paths aren't
in any way better, it's difficult to care.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213
Fixes: 9364d66cb7 (gallium/auxiliary/vl: Add video compositor compute shader render)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 958390a9bf)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Conflicts:
src/gallium/auxiliary/util/u_screen.c
src/gallium/docs/source/screen.rst
src/gallium/drivers/radeonsi/si_get.c
src/gallium/include/pipe/p_defines.h
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fixes: 414148cdc1 "nir: Support deref instructions in loop_analyze"
(cherry picked from commit 204846ad06)
This only appears to happen on Raven2.
Possible way to reproduce:
resource_get_handle(WINSYS_HANDLE_TYPE_KMS) --> sets is_shared = true
resource_get_handle(WINSYS_HANDLE_TYPE_DMABUF) --> fail
Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8d0d753bd0)
There is an object-level preemption workaround which requires this.
However, even without object-level preemption, we seem to have issues
with geometry flickering when 3D and compute are combined in the same
batch and this appears to fix it.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110395
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit b8842bc312)
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 23a9d20997)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Conflicts:
src/amd/vulkan/radv_pipeline.c
Fixes: 759b940389 ("util: Get program name based on path when possible")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 5b10ddf358)
program_invocation_name and program_invocation_short_name are both GNU
extensions. I don't believe one can exist without the other, so only
check for program_invocation_name.
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit c9b86cf526)
Since it can introduce comparisons.
Fixes: 028ce52739 "radv: Add non-uniform indexing lowering."
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 2301b2e029)
Fixes: This commit does not apply cleanly on 19.1 branch, as it depends
on other commits not present in the branch.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
A test case with a depth clear of 0.5 and format
MESA_FORMAT_Z24_UNORM_X8_UINT fails due to an inconsistent
clear value of 0.4999997.
Maybe it's better to improve?
CC: Jason Ekstrand <jason.ekstrand@intel.com>
Fixes: 0ae9ce0f29 (i965/clear: Quantize the depth clear value based on the format)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111113
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit a86eccfb78)
This fixes problems spotted within vk-gl-cts. The problem is that the
builtin functions refer to types, so we should not release types before
the builtins are released.
Fixes: 624789e370 ("compiler/glsl: handle case where we have multiple users for types")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110796
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Specifically the optimization of a conditional BREAK + WHILE sequence
into a conditional WHILE seems pretty broken. The list of successors
of "earlier_block" (where the conditional BREAK was found) is emptied
and then re-created with the same edges for no apparent reason. On
top of that the list of predecessors of the block immediately after
the WHILE loop is emptied, but only one of the original edges will be
added back, which means that potentially several blocks that still
have it on their list of successors won't be on its list of
predecessors anymore, causing all sorts of hilarity due to the
inconsistency in the control flow graph.
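The invariant being violated can be stated as a small check. This is a
toy model with hypothetical names (toy_cfg, toy_add_edge,
toy_cfg_is_consistent) that uses adjacency matrices instead of the
compiler's linked lists:

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_BLOCKS 8

/* Toy CFG: succ[a][b] means b is on a's successor list, and pred[b][a]
 * means a is on b's predecessor list. */
struct toy_cfg {
   int num_blocks;
   bool succ[TOY_MAX_BLOCKS][TOY_MAX_BLOCKS];
   bool pred[TOY_MAX_BLOCKS][TOY_MAX_BLOCKS];
};

/* Adds the edge a -> b to both lists so they stay in sync. */
static void toy_add_edge(struct toy_cfg *cfg, int a, int b)
{
   assert(a >= 0 && a < cfg->num_blocks && b >= 0 && b < cfg->num_blocks);
   cfg->succ[a][b] = true;
   cfg->pred[b][a] = true;
}

/* Every successor edge must have a matching predecessor entry, and
 * the other way around. */
static bool toy_cfg_is_consistent(const struct toy_cfg *cfg)
{
   for (int a = 0; a < cfg->num_blocks; a++) {
      for (int b = 0; b < cfg->num_blocks; b++) {
         if (cfg->succ[a][b] != cfg->pred[b][a])
            return false;
      }
   }
   return true;
}
```

Emptying one block's predecessor list and adding back only one of its
original edges makes this check fail, because the other blocks still
list it as a successor.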
The solution is to remove the code that's removing valid edges from
the CFG. cfg_t::remove_block() will already clean up after itself.
The assert in bblock_t::combine_with() also needs to be removed since
we will be merging a block with multiple children into the first one
of them.
Found the issue on a hardware enabling branch originally, but
apparently somebody reproduced the same problem independently on
master in the meantime.
Fixes: d13bcdb3a9 ("i965/fs: Extend predicated break pass to predicate WHILE.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111009
Cc: jiradet.jd@gmail.com
Cc: Sergii Romantsov <sergii.romantsov@globallogic.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Tested-by: Paul Chelombitko <qamonstergl@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 54fbc625ea)
These were left after a rebase and happen to make
NIR_INTRINSIC_SWIZZLE_MASK == NIR_INTRINSIC_SRC_ACCESS, which is how it
was noticed.
Fixes: 6f20643b47 ("nir: Allow qualifiers on copy_deref and image instructions")
Cc: Connor Abbott <cwabbott0@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 5d7bcac4e7)
Currently, if we error out before gbm_dri is set (say due to a different
name of the backing GBM implementation, or otherwise) the tear down will
trigger a NULL ptr deref and crash out.
Move the gbm_dri initialization as early as possible.
v2: Drop check in dri2_teardown_drm (Eric)
Reported-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 72b97ad9b2)
This fixes dEQP-VK.subgroups.quad.compute.subgroupquadswaphorizontal_*
on all gen7 platforms.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 8fd2f2c276)