Add negative value is possible to wrap around. I haven't seen this
"nuw" causes any problem yet, but let's remove it for safe.
Fixes: 60ac5dda82 ("ac: Add NIR lowering for NGG GS.")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19718>
We should not use "nuw" here as negative add positive may wrap
around (negative is 0xffffff??).
This problem can be observed with LLVM15 (I can't see when LLVM14):
%.neg = mul nsw i32 %31, -4
%163 = add nuw nsw i32 %.neg, 16
%164 = lshr i32 257, %.neg
%165 = lshr i32 %164, %163
LLVM just assume %.neg is possitive, so pre-shift 0x01010101 by 16.
This get wrong value because we can't get back the shifted bits with
a negative shift right.
Fixes: 75dbb40439 ("ac/nir: Remove byte permute from prefix sum of the repack sequence.")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19718>
In 23c7142cd6 ("anv: disable SIMD16 for RT shaders") we were forcing the SIMD8
using the mechanism for subgroup size control, which is problematic since it has
other effects on the shader behavior.
The code was changed to select the SIMD in a different way in the previous patches,
so we can revert the behavior to the original semantics.
Fixes dEQP-VK.subgroups.builtin_var.ray_tracing.subgroupsize.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19601>
We still update the cs_prog_data, but don't rely on it for this state anymore.
This will allow use the SIMD selector with shaders that don't use cs_prog_data.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19601>
This is a preparation to decouple the storage of what SIMDs
compiled/spilled from the cs_prog_data. This will allow reuse
of SIMD selection code by Bindless Shaders.
And since we have a struct now, move the error array there so
reduce the boilerplate of the users.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19601>
We don't intend to expose neither to drivers, so it is fine to be C++.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19601>
This fixes GALLIVM_DEBUG=asm for compute shaders, changing
the hooks after dumping causes a segfault because the
memory has already been finalised. Just add the hooks always,
and before dumping anything.
Fixes: f511d2a553 ("gallivm: rework coroutine malloc/free callouts.")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19714>
Direct3D and Vulkan's robustBufferAccess2 feature mandate that index
buffer out-of-bounds reads should return a zero index (ie, vertex at
index zero, not to be confused with a vertex with zero attributes, as
the kind resulting in vertex buffer out-of-bounds read.)
lavapipe was adding index_offset and start index together without
overflow checks, and if start index was sufficient large (as is the case
with WHCK wgf11draw which sets start index to (UINT)-5) it would cause
to wrap around causing fetches that should be out of bounds wrap around
and fetch inside bounds.
This change fixes this by doing a clamped add. This ensures start index
is set to UINT32_MAX on overflow, which is sufficient in practice to
trigger draw index OOB code-paths, yield zero index to be returned.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19683>
So it can be used outside draw. Also drop the overflow_value parameter,
as it wasn't meaningfully useful.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19683>
These happen when shader cache is disabled (ENABLE_SHADER_CACHE
undefined) due to a prototype mismatch.
Also remove redundant return statements.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19683>
add the logic for calculating film grain related
coefficients for VCN to generate film grain output.
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19660>
The usual reasons: Flake handling, familiar skips/xfails handling, faster
parallelization. This also sets us up for running a subset of the CL CTS
once we decide to build it in our containers.
Acked-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19608>
Fixes a vtn crash with
dEQP-VK.ray_tracing_pipeline.pipeline_library.configurations.singlethreaded_compilation.s0_l1
and validation enabled.
Closes: #7642
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19700>
The helper used ralloc which is unusual for vulkan objects, did not
handle allocation failures properly and was only useful for RADV.
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19700>
The code builds up the dynamic array of objects (spirv_objs) and
collect pointers to each of them into another dynamic
array (spirv_ptr_objs).
If the growth of the first array cause a reallocation, it is
possible that the previous pointers end up invalid.
Fixes: 77e929a527 ("intel/clc: allow multiple CL files to be compiled together")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19730>
After 'v3dv: fix debug dump on BO free' we changed the order, and this
lead to the following test
dEQP-VK.api.object_management.multithreaded_per_thread_resources.device_memory_small
v2: Expanded comment just before the reset, explaining that we need to
do the reset before we free the BO from the kernel (Iago)
Raising this assertion:
deqp-vk: ../src/broadcom/vulkan/v3dv_bo.c:281: v3dv_bo_alloc: Assertion `bo && bo->handle == 0' failed.
Fixes: 2c44597181 ('v3dv: fix debug dump on BO free')
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19693>
This reverts commit cb02cf464c.
There are 3 reported flakes over a period of a month, and we have been
unable to reproduce it even once. It clearly doesn't happen often enough
to warrant disabling our vulkan CI, so let's restore it while we
continue to try to reproduce the issue on our side.
Signed-off-by: Eric Engestrom <eric@igalia.com>
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19720>
We only support lazy descriptors these days, so having the
infrastructure around to support automatic selection of that one mode is
kinda silly.
And it's not like setting an environment variable that is never read is
going to cause any issues, so we don't even need this to avoid breaking
existing setups.
Let's just rip it out. We can reintroduce it again on the off-chance
that someone has a new clever descriptor mode they want to experiment
with.
Reviewed-by: Hoe Hao Cheng <haochengho12907@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19670>
VGPR allocation changed on GFX11 and this might have changed how
shared VGPRs work, so it's probably more secure to lower in NIR.
Cc: 22.3 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19679>
This is a port of the fp pair regalloc. It is however much simpler as
contrary to the fp, we don't have to care about texturing, we can use
any swizzle we want and we don't have to track the inputs. Using the
register class machinery might actually be a slight overkill right now,
however the infrastructure will hopefully come in handy if someone
decides to implement the vp pair scheduling eventually.
Shader-db stats:
RV530:
total temps in shared programs: 18594 -> 17000 (-8.57%)
temps in affected programs: 5753 -> 4159 (-27.71%)
helped: 665
HURT: 0
RV370:
total temps in shared programs: 13555 -> 12181 (-10.14%)
temps in affected programs: 5116 -> 3742 (-26.86%)
helped: 633
HURT: 0
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5972
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Tested-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19618>