When I copied and pasted the code from MOV_INDIRECT for handling the
dependency controls, I missed a subtle difference between MOV_INDIRECT
and SHUFFLE. Specifically, MOV_INDIRECT gets lowered to a narrow
instruction on Gen7 by the SIMD width lowering whereas SHUFFLE has to
split it in the generator. Therefore, the check safety check for
whether or not we can use dependency control has to be based on the
lowered width rather than the width of the original instruction.
Fixes: a8ac61b0ee "intel/fs: NoMask initialize the address..."
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3593
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6989>
(cherry picked from commit 8427e56067)
The volatile pattern gives me flaky results for 32-bit builds on
ChromeOS Android. This is because on 32-bit the volatile 64-bit
loads gets split into 2 32-bit loads each.
So if we read the lower dword first and then the upper dword, it
can happen that the upper dword is already changed but the lower
dword isn't yet. In particular for occlusion queries this gives
false readings, as the upper dword commonly only constains the
ready bit.
With the GCC atomic intrinsics we get a call to __atomic_load_8
in libatomic.so which does the right thing.
An alternative fix would be to explicitly split the 32-bit loads
in the right order and do a bunch of retries if things change, though
that gets messy quickly and for 32-bit builds only doesn't feel worth
it that much.
CC: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6933>
(cherry picked from commit 7568c97df1)
Without this, it was checking bit size compatibility with bit sizes such
as 96 which is clearly invalid.
No shader-db changes on Ice Lake
Fixes: ce9205c03b "nir: add a load/store vectorization pass"
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6871>
(cherry picked from commit 57e7c5f05e)
Do not throw a deprecation warning if the power8 option is set to the
new 'disabled' value. Instead, warn if it is still set to the legacy
value 'false'.
Fixes: 138c003d22 ("meson: deprecated 'true' and 'false' in combo options for 'enabled' and 'disabled'")
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6370>
(cherry picked from commit 03bea54e02)
The linker was adding all state vars as uniforms, doubling the storage size
for shaders using only builtin uniforms, which increased CPU overhead for
constant buffer uploads.
When this code was originally ported from the GLSL IR linker we forgot
to exclude builtins because the check was not done in the
add_uniform_to_shader class but rather a check was done when passing
variables to this class for processing.
Fixes: 664e4a610d ("glsl/nir: Fill in the Parameters in NIR linker")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Tested-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6958>
(cherry picked from commit 038fcbcaed)
Workaround # 22011374674
Applied to i965, iris and anv drivers
No performance impact is observed with WA.
Cc: mesa-stable
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 545d852a7a)
After disable SDMA on Arcturus(gfx9), dead lock with aux_context_lock is
detected since si_screen_clear_buffer is called recursively before
release lock.
The call trace is:
si_clear_render_target->si_compute_clear_render_target->
si_launch_grid_internal->si_launch_grid->si_emit_cache_flush->
si_prim_discard_signal_next_compute_ib_start->u_suballocator_alloc->
si_resource_create->si_buffer_create->si_alloc_resource->
si_screen_clear_buffer->simple_mtx_lock->
si_sdma_clear_buffer->si_pipe_clear_buffer->
si_clear_buffer->si_compute_do_clear_or_copy->
si_launch_grid_internal->si_launch_grid->si_emit_cache_flush->
si_prim_discard_signal_next_compute_ib_start->u_suballocator_alloc->
si_resource_create->si_buffer_create->si_alloc_resource->
si_screen_clear_buffer->simple_mtx_lock
Fixes: 07a49bf597 "radeonsi: disable SDMA on gfx9"
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6941>
(cherry picked from commit 5e8791a0bf)
If we want to use HTILE correctly we need to communicate extra stuff
like clear colors. (Unlike DCC there is no HTILE FCE)
CC: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit d78df70c2a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6877>
In the case where end was a instruction-based cursor, we would mix up
our blocks and end up with block_begin pointing after the second split.
This causes a segfault as the cf_node list walk at the end of the
function never terminates properly. There's also a possibility of
mix-up if begin is an instruction-based cursor which was found by
inspection.
Fixes: fc7f2d2364 "nir/cf: add new control modification API's"
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6866>
(cherry picked from commit 7dbb1f7462)
vl_mpeg12_decoder needs to override the chroma_format value to get the
correct size calculated (chroma_format is used by vl_video_buffer_adjust_size).
I'm not sure why it's needed, but this is needed to get correct mpeg decode.
Fixes: 24f2b0a856 ("gallium/video: remove pipe_video_buffer.chroma_format")
Acked-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6817>
(cherry picked from commit 2584d48b2c)
Fix defect reported by Coverity Scan.
Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking image suggests that it may be
null, but it has already been dereferenced on all paths leading to
the check.
Fixes: ad609bf55a ("frontend/dri: Implement mapping individual planes.")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6807>
(cherry picked from commit 03e7b75c22)
I noticed this once I started gathering xfb_info after
nir_lower_io_arrays_to_elements_no_indirect.
Fixes: b2bbd978d0 ("nir: fix lowering arrays to elements for XFB outputs")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6514>
(cherry picked from commit 5a88db682e)
There's no good reason not to use a symmetric rounding mode here. This
fixes the following GL CTS case for me:
GTF-GL33.gtf21.GL3Tests.texture_lod_bias.texture_lod_bias_all
Fixes: 132b69c4ed ("st/mesa: round lod_bias to a multiple of 1/256")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6892>
(cherry picked from commit 7685c37bf4)
As documented in the galcore kernel driver "only LOD0 is valid
for this register". This makes sense, as NTE's LINEAR_STRIDE is
only capable to store one linear stride value per sampler.
This fixes linear textures in sampler slot != 0.
Fixes: 34458c1cf6 ("etnaviv: add linear sampling support")
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Michael Tretter <m.tretter@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3285>
(cherry picked from commit a7e3cc7a0e)
the previous version of this leaked a reference to the streamout buffer here
thanks to deltragon on my blog for pointing this out!
Fixes: 37778fcd9a ("zink: implement transform feedback support to finish off opengl 3.0")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6457>
(cherry picked from commit 7971918924)
Without this, we end up throwing errors on code along these lines when
rendering using single-buffering:
GLint att;
glGetIntegerv(GL_READ_BUFFER, &att);
glGetFramebufferAttachmentParameteriv(GL_READ_FRAMEBUFFER, att, ...);
This is because we internally translate GL_BACK (which is what
glGetIntegerv returned) to GL_FRONT, which we don't handle in the
Desktop GL case. So let's start handling it.
This fixes the GLTF-GL33.gtf21.GL2FixedTests.buffer_color.blend_color
test for me.
Fixes: e6ca6e587e ("mesa: Handle pbuffers in desktop GL framebuffer attachment queries")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6815>
(cherry picked from commit 9e13a16c97)
Section 5.11 (Out-of-Bounds Accesses) of the GLSL 4.60 spec says:
"In the subsections described above for array, vector, matrix and
structure accesses, any out-of-bounds access produced undefined
behavior.... Out-of-bounds reads return undefined values, which
include values from other variables of the active program or zero."
Robustness extensions suggest to return zero on out-of-bounds
accesses, however it's not applicable to the arrays of samplers,
so just clamp the index.
Otherwise instr->sampler_index or instr->texture_index would be out
of bounds, and they are used as an index to arrays of driver state.
E.g. this fixes such dereference:
if (options->lower_tex_packing[tex->sampler_index] !=
in nir_lower_tex.c
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6428>
(cherry picked from commit f2b17dec12)
Out-of-bounds writes could be eliminated per spec:
Section 5.11 (Out-of-Bounds Accesses) of the GLSL 4.60 spec says:
"In the subsections described above for array, vector, matrix and
structure accesses, any out-of-bounds access produced undefined
behavior.... Out-of-bounds writes may be discarded or overwrite
other variables of the active program."
Fixes: 1235850522
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6428>
(cherry picked from commit 0ba82f78a5)
Out-of-bounds writes could be eliminated per spec:
Section 5.11 (Out-of-Bounds Accesses) of the GLSL 4.60 spec says:
"In the subsections described above for array, vector, matrix and
structure accesses, any out-of-bounds access produced undefined
behavior....
Out-of-bounds writes may be discarded or overwrite
other variables of the active program.
Out-of-bounds reads return undefined values, which
include values from other variables of the active program or zero."
GL_KHR_robustness and GL_ARB_robustness encourage us to return zero
for reads.
Otherwise get_io_offset would return out-of-bound offset which may
result in out-of-bound loading/storing of inputs/outputs,
that could cause issues in drivers down the line.
E.g. this fixes such dereference:
int vue_slot = vue_map->varying_to_slot[intrin->const_index[0]];
in brw_nir.c
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6428>
(cherry picked from commit 66669eb529)
From the Vulkan 1.2.154 spec:
"If pCounterBufferOffsets is NULL, then it is assumed the
offsets are zero."
Fix new CTS
dEQP-VK.transform_feedback.simple.backward_dependency_no_offset_array.
CC: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6798>
(cherry picked from commit 2b99e15d0a)
Lets just do depth interop imports by convention between radv and
radeonsi for now. The only thing using this should be Vulkan interop
anyway.
CC: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6617>
(cherry picked from commit ecc19e9819)
It's not really unordered in the sense that it can still stall on
ordered things and we don't need a SYNC_NOP for that because it is a
SYNC_NOP. However, it also doesn't count when computing instruction
distances.
Fixes: 18e72ee210 "intel/fs: Add FS_OPCODE_SCHEDULING_FENCE"
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6781>
(cherry picked from commit f63ffc18e7)
If radv_layout_can_fast_clear() is false, 028C70_COMPRESSION is unset when
the image is rendered to and CMASK isn't updated. This appears to cause
FMASK to be ignored and the 0th sample to always be used.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3449
Fixes: 7b21ce401f
('radv: disable FMASK compression when drawing with GENERAL layout')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6745>
(cherry picked from commit 85cc2950a0)