In the following arrangement the old logic leads to the following:
|
v
+----------+------------+
|block5 |
|m815 = PHI m1034, m860 |<-----------+
|343 = FMA.f32 ... | |
+----------+------------+ |
| |
+--------------+ |
| | |
v v |
+-----+ +-----+ |
|b6 | |b7,8 | |
| | | | |
+-----+ +--+--+ |
| +---+ | +---+ |
+----|b9 +-----+----|b10+---+ |
v +---+ +---+ v |
+-------+-------------+ +-------+---------+ |
|block12 | |block11 | |
|m882 = PHI m815, m860| |m860 = MEMMOV 343+--+
+---------+-----------+ +-----------------+
v
The spill of / into m860 (corresponding to 343) ends up in block11 when
insert_coupling_code(succ=block5, pred=block11) because of the memory
phi in block5. Later, in insert_coupling_code(block12, block9), we
reject inserting the spill after ca9c9957. As a result, m860 is
undefined along block5 -> block7,8 -> block9 -> block12.
When the spill position is chosen first, ctx->block is block5 so
choose_spill_position falsely returns the fallback position. The issue
can be fixed by explicitly passing the "current block".
Fixes: ca9c9957 ("pan: Avoid some redundant SSA spills")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
(cherry picked from commit 09e1ba28e5)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
0886be09 ("glsl: Allow precision mismatch on dead data with GLSL ES 1.00")
allowed precision mismatches on uniforms, however if you lower precision on
16-bit consts, then this error triggers instead.
So here we relax the type matching and just make sure we match int vs
float.
Fixes: 0886be09 ("glsl: Allow precision mismatch on dead data with GLSL ES 1.00")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5337
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 73bc604128)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
AMD docs support this:
R5xx Acceleration v1.5 says safest handling for ZFUNC changes is to disable
HiZ except specific LESS/LEQUAL and GREATER/GEQUAL transitions.
ATI OpenGL Programming and Optimization Guide advises avoiding ALWAYS when
trying to benefit from HiZ so that would imply fglrx also disables HiZ
there.
On RV530 this fixes the following dEQPs:
dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.43
dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.74
Fixes: 12dcbd5954 ("r300g: enable Hyper-Z by default on r500")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8093
(cherry picked from commit b0f019f8cf)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
The DISCARD_WHOLE_RESOURCE path in vc4_map_usage_prep() replaces the
resource's BO with vc4_resource_bo_alloc(). As the RCL resolves
rsc->bo at job submit in vc4_submit_setup_rcl_surface(), any pending
write job would store to the new BO instead of the old one, corrupting
the new written data.
This is the same bug that was fixed in v3d in the previous commit.
Fixes: 18ccda7b86 ("vc4: When asked to discard-map a whole resource, discard it.")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit ecb6c5d555)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
The DISCARD_WHOLE_RESOURCE path in v3d_map_usage_prep() replaces the
resource's BO with v3d_resource_bo_alloc(). As the RCL resolves
rsc->bo at job submit in emit_rcl() any pending write job would
store to the new BO instead of the old one, corrupting the new
written data.
This is adressed by flushing all pending write jobs affecting the
resource before replacing its BO.
This fixes multiple tests where data copied to a renderbuffer was
overwritten by a previos GPU clear. Test are from the subgroup:
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.*
Fixes: 45bb8f2957 ("broadcom: Add V3D 3.3 gallium driver called "vc5", for BCM7268.")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 1eaa46da09)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
We might need to DCE users of dead instructions removed by
process_block().
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 9e8ba10447 ("aco/vn: remove dead instructions early")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
(cherry picked from commit 17b18496f6)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
We're setting this in the non-DRI codepath, but this was missed when we
started embedding the VA driver into libgallium. This means we no longer
were able to use VA-API from meson devenv, like we could before.
Fixes: 212d57f7e6 ("targets/va: Build va driver into libgallium when building with dri")
(cherry picked from commit 7e4744909b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
This change is useful when the compute shader is called multiple
times with the atomic operations enabled. It fixes some data
coherency issues. This is done by moving
evergreen_emit_atomic_buffer_setup() after r600_flush_emit().
This change is also a partial fix for compute_shader.pipeline-compute-chain.
In this specific case, it makes the memory barrier working.
This change was tested on cayman and barts; it makes these tests
fully deterministic:
khr-gl4[2-6]/shader_atomic_counters/advanced-usage-many-dispatches: fail pass
khr-gles31/core/shader_atomic_counters/advanced-usage-many-dispatches: fail pass
deqp-gles31/functional/synchronization/inter_call/without_memory_barrier/atomic_counter_dispatch_.*_calls_.*_invocations: fail pass
Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
(cherry picked from commit dad942b468)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
The alpha instruction always wrote to the same rendertarget as the rgb and the
original target was ignored (surprisingly the HW docs explicitly allows rgb and
alpha to write to different targets). This makes tesseract rendering a bit
better, but there are still some remaining issues.
Fixes: 1c2c4ddbd1 ("r300g: copy the compiler from r300c")
Reviewed-by: Filip Gawin <filip@gawin.net>
(cherry picked from commit 87a881558f)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
When building with "-Dvideo-codecs=h264dec,h265dec,av1dec" va/encode.c
won't be built but it's still required because it's used from
picture.c
Fixes: c4f05bdf60 ("frontends/va: include picture_*.c based on selected codec")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 82a51ba9b3)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
Among all subcommands, only gfx subcommands are bound to a query pool,
other subcommands seem to need no special handling.
In addition, if a ResetQuery is done before BeginQuery, the last
subcommand will be a event one, which fails the current assert that
assumes it's a gfx one.
Change the assertion of the subcommand being a gfx one to an addition
check of whether the subcommand is a gfx one.
This fixes crash of Vulkan CTS 1.4.5.1 test
dEQP-VK.query_pool.discard.normal.no_depth.none.discard .
Backport-to: 26.0
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
(cherry picked from commit 5a497316d4)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
In 1f83e73145, the pre-encode input picture size was also reduced.
However it was recently discovered that VCN FW uses the input picture
pitch as the pitch for this, which means that previous change broke
pre-encode.
Fixes: 1f83e73145 ("radeonsi/vcn: Reduce allocated size for pre-encode recon pics")
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
(cherry picked from commit 2b2b1d405a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
The logic is supposed to find the stage with the maximum constlen to
trim for each time we have to trim a stage. But by not resetting
max_constlen each time, we would "trim" the same stage repeatedly,
leaving us thinking the total is below the limit when it actually isn't.
Cc: mesa-stable
(cherry picked from commit ae8928b638)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
The HW uses ViewportIndex to select which GRAS_BIN_FOVEAT offset to use.
For normal 3d draws, either the ViewportIndex equals the view/layer or
we make the offset the same for all viewports/layers, but we aren't
aware of this in the 3d path and we always use viewport 0.
Use the HW offset 0 when subtracting the HW offset. This is a bit of a
hack, but it should work. This fixes LOAD_OP_LOAD with FDM.
Fixes: b34b089ca1 ("tu: Use GRAS bin offset registers")
(cherry picked from commit 68c0031f56)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
- RGBA8888_* is a preprocessor alias for R8G8B8A8_* in u_format.yaml.
- Both entries in the format tables collide on the same enum value, and
RGBA8888 overwrites R8G8B8A8.
- The fix was reverting to the version that was in the commit
39e949434c because there is a different format
was used that did not cause any collisions.
dEQP fixes:
dEQP-VK.api.info.format_properties.r8g8b8a8_sint
dEQP-VK.api.info.format_properties.r8g8b8a8_snorm
dEQP-VK.api.info.format_properties.r8g8b8a8_uint
dEQP-VK.api.info.format_properties.r8g8b8a8_unorm
Fixes: 9f740b26a6 ("pvr: Fix bugs in the format table")
Signed-off-by: Leon Perianu <leon.perianu@imgtec.com>
Tested-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
(cherry picked from commit 7c6dbb099a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
We should be returning if no GS is needed and no GS shader is bound.
This fix various segfaults introduced by the original fix.
Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: e10f29399f ("hk: fix passthrough GS key invalidation")
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Janne Grunau <j@jannau.net>
(cherry picked from commit 6d040df750)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
With perfetto that string is processed later leading to
use-after-free.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit 413e169f45)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
Blorp emits 3DSTATE_BINDING_TABLE_POINTER_* instructions in 3D mode.
At the moment we're saved by the push constants reemitting the btp but
we'll drop that in the next commit.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit 533c748b34)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
Cache flushes should be skipped on SDMA. In practice,
radv_emit_cache_flush() should only be called on GFX/ACE.
SDMA NOP packets are emitted in barriers directly.
This fixes recent VKCTS coverage
dEQP-VK.api.command_buffers.secondary_on_transfer_queue.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit c4d5090d69)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
This should definitely be an OR operation if MRT0 and MRT1 don't write
the same channels. This also requires to set the writemask manually
because when it's 0 (in case a dual-source output is missing), the
intrinsic computes the mask itself with the number of components.
No fossils-db changes on NAVI33.
Fixes: 45d8cd037a ("ac/nir: rewrite ac_nir_lower_ps epilog to fix dual src blending with mono PS")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14878
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 2eb9420061)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
On the Rogue architecture add support for using a fragment passthrough
shader when there is no fragment shader present in a graphics
pipeline but the sample mask is required.
fix:
dEQP-VK.pipeline.monolithic.empty_fs.masked_samples
Backport-to: 26.0
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Co-authored-by: Simon Perretta <simon.perretta@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 14508b4c9a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
Configure the EOT setup for SPM EOT programs so that the generated
programs load the tile buffer into the output buffer before doing
the emit
Partial fix for:
dEQP-VK.renderpass.*.attachment_allocation.input_output.71
Backport-to: 26.0
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit d1f2ad17dd)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
In subpasses preserve attachments are not used by the subpass but
their contents must be preserved throughout the subpass.
Add a list for the preserve attachments info specified by a subpass
and when determining a subpass attachments total uses check the
preserve attachments list and add it uses to the total.
Partial fix for:
dEQP-VK.renderpass.*.attachment_allocation.input_output.71
Backport-to: 26.0
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 0e01b9ef2d)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
The struct will also be used for preserve attachments in the next
commit.
Backport-to: 26.0
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit e18670347a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
When calculating the dwords per pixel the output registers should
always be taken into account in addition to the number of tile buffers.
Fixes incorrect scratch buffer space calculation when both output
registers and tile buffers are emitted by a render.
Partial fix for:
dEQP-VK.renderpass.*.attachment_allocation.input_output.71
Fixes: 3457f8083a ("pvr: Acquire scratch buffer on framebuffer creation.")
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit df445dc9b9)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
The subpass merging optimisation check for when subpasses are using
tile buffers was in the incorrect location.
The current check is in a function called from two places but only
the first of these should have been doing the optimisation check.
This was incorrectly affecting the number of renders that subpass
merging could avoid.
Partial fix for:
dEQP-VK.renderpass.*.attachment_allocation.input_output.71
Fixes: 10b6a0d567 ("pvr: Add support for generating render pass hw setup data.")
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 0640ac7e3b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
Empirically, TCS outputs have to be aligned to 64 bytes,
otherwise stale data may be read in rare cases. The exact
reason is not clear, but tests and proprietary driver behavior
strongly point at the need for 64 byte alignment.
Fixes tesselation issues in at least "Conan Exiles" but likely in many
more cases.
CC: mesa-stable
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
(cherry picked from commit 47251b2e2d)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
The Vulkan spec says about VkFormatFeatureFlagBits:
If a format does not incorporate chroma downsampling (it is
not a “422” or “420” format) but the implementation supports
sampler Y′CBCR conversion for this format, the implementation
must set VK_FORMAT_FEATURE_MIDPOINT_CHROMA_SAMPLES_BIT.
Fixes: af062126ae
Signed-off-by: Benjamin Otte <otte@redhat.com>
(cherry picked from commit 0b6dd167ac)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
This is possible since VK_KHR_maintenance10.
This fixes new VKCTS coverage in
dEQP-VK.pipeline.*.multisample.m10_resolve.*.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit ab6147e8ef)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>