Using memset() to zero a few sequential fields in gl_pixelstore_attrib
is a bit dodgy (what if someone were to add/reorder fields?). And gcc
emits a warning in optimized builds:
In function ‘memset’,
inlined from ‘copy_converted_buffer’ at ../src/mesa/state_tracker/st_pbo_compute.c:1038:7,
inlined from ‘st_GetTexSubImage_shader’ at ../src/mesa/state_tracker/st_pbo_compute.c:1146:7:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:71:10: warning: ‘__builtin_memset’ offset [9, 24] from the object at ‘packing’ is out of the bounds of referenced subobject ‘RowLength’ with type ‘int’ at offset 4 [-Warray-bounds]
71 | return __builtin___memset_chk (__dest, __ch, __len, __bos0 (__dest));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Just replace the memset with ordinary assignments.
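As a rough illustration of the change, the sketch below clears a run of fields by name. The struct is a simplified stand-in for gl_pixelstore_attrib, not the real Mesa definition, and the helper name reset_skip_fields() and the exact set of fields cleared are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for gl_pixelstore_attrib; the real struct has more
 * fields and its layout may differ. */
struct pixelstore_attrib {
   int Alignment;
   int RowLength;
   int SkipPixels;
   int SkipRows;
   int ImageHeight;
   int SkipImages;
   bool SwapBytes;
};

/* Hypothetical helper: clear each field by name instead of calling
 * memset(&p->RowLength, 0, 5 * sizeof(int)), which writes past the bounds
 * of the RowLength subobject and silently depends on field order. */
static void
reset_skip_fields(struct pixelstore_attrib *p)
{
   assert(p);

   p->RowLength = 0;
   p->SkipPixels = 0;
   p->SkipRows = 0;
   p->ImageHeight = 0;
   p->SkipImages = 0;
}
```

With named assignments, adding or reordering fields in the struct cannot change which fields get cleared.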
Signed-off-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18261>
Only the workgroup size computation remains at the same place, but I
think it should be computed in a separate helper later.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18210>
This is used for compute and task shaders and will help for adding
new helpers.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18210>
That shouldn't change anything for VS as LS (or as ES) and for
TES as ES because radv_vs_output_info is only used by the last
vertex stage. So, if we have TES+GS, radv_vs_output_info for TES
will be overwritten by GS. This allows decoupling the shader info
pass from other stages.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18210>
Task shaders always use a ring, so this field was effectively useless.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18210>
This is probably a leftover from when task shaders were reworked, and
it has no effect.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18210>
radv_nir_shader_info_pass() should run on individual shaders only, and
"linked" shader info should be done separately for better design.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18210>
This structure isn't really useful and it contains only one field.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18210>
Size is in bytes, not bits.
Fixes plenty of crashes in CI, like
dEQP-VK.synchronization.op.single_queue.event.write_image_fragment_read_image_tess_eval.image_128_r32_uint.
Fixes: 46f6e2ddbb ("aco: Implement storage image A16.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18266>
The following sequence would be broken if we don't re-emit viewports.
vkCmdSetViewport()
vkCmdBindPipeline(negative_one_to_one = false)
vkCmdDraw()
vkCmdBindPipeline(negative_one_to_one = true)
vkCmdDraw()
Found by inspection.
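A minimal sketch of the idea, assuming the emitted viewport state depends on negative_one_to_one: flag the viewport as dirty whenever a newly bound pipeline disagrees with the previous one. The structs, the DIRTY_VIEWPORT bit and bind_pipeline() are hypothetical names for this illustration, not the driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DIRTY_VIEWPORT (1u << 0) /* hypothetical dirty bit */

struct pipeline {
   bool negative_one_to_one;
};

struct cmd_state {
   const struct pipeline *pipeline; /* currently bound pipeline, may be NULL */
   uint32_t dirty;
};

/* Bind a pipeline and flag the viewport for re-emission when the depth
 * clip convention differs from the previously bound pipeline (or when no
 * pipeline was bound before). */
static void
bind_pipeline(struct cmd_state *state, const struct pipeline *pipeline)
{
   assert(state && pipeline);

   if (!state->pipeline ||
       state->pipeline->negative_one_to_one != pipeline->negative_one_to_one)
      state->dirty |= DIRTY_VIEWPORT;

   state->pipeline = pipeline;
}
```

In the sequence above, the second bind flips negative_one_to_one, so the viewport set once by vkCmdSetViewport() would be emitted again before the second draw.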
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18245>
A better explanation for SP_HS_WAVE_INPUT_SIZE is that it is the size
of local memory to allocate per wave (which can be more than one
patch), in 256B units.
Then the maximum of 64 makes sense because only 16KB of local memory
is reserved for VS<->HS linkage.
The resulting formula matches the blob behaviour, even when
patch_control_points and tcs_vertices_out have different values,
while the past formula gave wrong answers on gen3+.
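The unit arithmetic behind the maximum of 64 can be checked with a small sketch. It only covers the byte-to-unit conversion (16KB / 256B = 64) and does not reproduce the driver's per-wave size formula; the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define HS_INPUT_UNIT_BYTES   256u          /* value is expressed in 256B units */
#define VS_HS_LOCAL_MEM_BYTES (16u * 1024u) /* reserved for VS<->HS linkage */

/* Convert a per-wave local memory size in bytes to 256B units, rounding up.
 * The caller is expected to stay within the 16KB reservation. */
static uint32_t
hs_wave_input_units(uint32_t bytes_per_wave)
{
   assert(bytes_per_wave <= VS_HS_LOCAL_MEM_BYTES);
   return (bytes_per_wave + HS_INPUT_UNIT_BYTES - 1) / HS_INPUT_UNIT_BYTES;
}
```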
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Suggested-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17957>
Mirrors the change from 31835ac3b8 in freedreno.
Together with "tu: Fix HS input size formula for gen3+", this fixes the
following tests from GL CTS running via Zink:
dEQP-GLES31.functional.tessellation.invariance.inner_triangle_set.quads_fractional_odd_spacing
dEQP-GLES31.functional.tessellation.invariance.inner_triangle_set.triangles_fractional_odd_spacing
dEQP-GLES31.functional.tessellation.invariance.primitive_set.triangles_fractional_odd_spacing_ccw
dEQP-GLES31.functional.tessellation.invariance.primitive_set.triangles_fractional_odd_spacing_cw
dEQP-GLES31.functional.tessellation.invariance.triangle_set.triangles_fractional_odd_spacing
dEQP-GLES31.functional.tessellation.primitive_discard.quads_fractional_odd_spacing_ccw
dEQP-GLES31.functional.tessellation.primitive_discard.quads_fractional_odd_spacing_cw
dEQP-GLES31.functional.tessellation.primitive_discard.triangles_fractional_odd_spacing_ccw
dEQP-GLES31.functional.tessellation.primitive_discard.triangles_fractional_odd_spacing_cw
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17957>
It can be useful not just to create functions, but also to be able to
call them. This adds the spirv_builder helper for this.
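For reference, a call is encoded in SPIR-V as an OpFunctionCall instruction (opcode 57): a header word, the result type id, the result id, the function id, then the argument ids. The standalone sketch below shows that encoding only; encode_function_call() is a hypothetical name and is not the spirv_builder API added here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SPV_OP_FUNCTION_CALL 57u /* OpFunctionCall in the SPIR-V spec */

/* Hypothetical encoder: writes one OpFunctionCall instruction into `out`
 * and returns the number of words written, or 0 if it does not fit. */
static size_t
encode_function_call(uint32_t *out, size_t out_words,
                     uint32_t result_type, uint32_t result_id,
                     uint32_t function_id,
                     const uint32_t *args, size_t num_args)
{
   size_t words = 4 + num_args;
   assert(out && (args || num_args == 0));

   /* The word count is stored in the upper 16 bits of the first word. */
   if (words > out_words || words > UINT16_MAX)
      return 0;

   out[0] = SPV_OP_FUNCTION_CALL | ((uint32_t)words << 16);
   out[1] = result_type;
   out[2] = result_id;
   out[3] = function_id;
   for (size_t i = 0; i < num_args; i++)
      out[4 + i] = args[i];

   return words;
}
```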
Cc: mesa-stable
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18244>
This type will be reused later on, so let's have the name describe what
it *is*, not what it's *used for*.
Cc: mesa-stable
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18244>
When SSBO instructions use constant address values, the address load
is immediately ready. Scheduling the address loads early increases
register pressure, so force a new instruction block to work around
this problem.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6975
Fixes: 79ca456b48 ("r600/sfn: rewrite NIR backend")
v2: do handling in shader block to be thread safe (hinted to by Filip)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Filip Gawin <filip@gawin.net> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18212>
Limit the number of tested instructions and the number of
ready instructions that might be taken into account.
This reduces the time needed to run the scheduler significantly.
Fixes: 79ca456b48 ("r600/sfn: rewrite NIR backend")
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18212>
Export instructions allow burst writes, so it makes sense to try
to allocate consecutive registers. For ring writes, however, we don't
schedule the outputs correctly to exploit this, so for now
don't mark these instructions as export to let the RA restart
picking colors.
When the scheduler starts to emit the ring writes in the right order
to allow for burst writes, we might revisit this.
This fixes
spec@glsl-1.50@execution@variable-indexing@gs-output-array-vec4-index-wr
Fixes: 79ca456b48 ("r600/sfn: rewrite NIR backend")
Related: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6975
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18212>
Linear PE has already been shown to have some rough corner cases in the
hardware and also has performance implications. Add a debug option to disable
the feature, so users can more easily check whether an issue is caused by it.
CC: mesa-stable #22.2
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18232>
When linear rendering is used together with TS, the color tiles must be fully
contained in a single row of pixels. When wrapping around to the next row
TS gets confused and records wrong tile status information, leading to visual
corruption when the surface is resolved/decompressed.
The corruption can be fixed by increasing the stride alignment for linear
render targets, but that would break some existing use-cases, as some display
engines used together with Vivante GPUs currently don't support strides that
don't match the horizontal display resolution.
For now, only enable linear PE rendering when the surface is already properly
aligned. This allows the optimization to be used in many common use-cases, but
falls back to the proven tiled rendering with subsequent resolve into linear
for the problematic cases.
CC: mesa-stable #22.2
Fixes: 53445284a4 ("etnaviv: add linear PE support")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18232>
The decision whether to use fast clear aka TS currently checks for two
feature bits: FAST_CLEAR and MC20. We check for MC20, as TS on MC1.0 bypasses
the memory offset and we don't have any way to fix up the GPU address to
account for that. It could be done with some support from the kernel driver,
but GPUs with MC1.0 are very rare these days, so it is not clear that we
are ever going to bother with that.
Instead of checking two separate feature bits to determine if TS can be used,
mask out the FAST_CLEAR bit from the features when MC20 isn't present. This
way we only have to check for a single feature bit.
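A minimal sketch of the masking, assuming both bits live in one feature word. The bit positions and function names are hypothetical; the real feature words are defined by the hardware and may be split differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define FEATURE_FAST_CLEAR (1u << 0)
#define FEATURE_MC20       (1u << 1)

/* Done once when the feature words are read: drop FAST_CLEAR when MC20 is
 * missing, so later code never sees an unusable fast clear bit. */
static uint32_t
mask_unusable_features(uint32_t features)
{
   if (!(features & FEATURE_MC20))
      features &= ~FEATURE_FAST_CLEAR;

   /* Postcondition: FAST_CLEAR implies MC20. */
   assert((features & FEATURE_MC20) || !(features & FEATURE_FAST_CLEAR));
   return features;
}

/* After masking, a single bit test decides whether TS can be used. */
static int
can_use_ts(uint32_t features)
{
   return (features & FEATURE_FAST_CLEAR) != 0;
}
```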
CC: mesa-stable #22.2
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18232>