pass_flags is only initialized for grouped loads, so change the order
Fixes: 33b4eb149e - nir: add new SSA instruction scheduler grouping loads into indirection groups
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16090>
(cherry picked from commit f7a77ff900)
these are translated into memory+control barriers in nir, and only
the control barrier needs to be handled
these semantics match what glslang does, so they must be right
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15959>
(cherry picked from commit 55baf0c676)
The Iris code that deals with implicit tracking is protected by
bufmgr->bo_deps_lock. Before this patch, we hold this lock during
update_batch_syncobjs() but don't keep it held until we actually
submit the batch in the execbuf ioctl. This can lead to the following
race condition:
- Context C1 generates a batch B1 that signals syncobj S1.
- Context C2 generates a batch B2 that depends on something that B1
from C1 is using, so we mark B2 as having to wait syncobj S1.
- C2 calls submit_batch() before C1 does it.
- The Kernel detects it was told to wait on syncobj S1 that was
never even submitted, so it returns EINVAL to the execbuf ioctl.
- We run abort() at the end of _iris_batch_flush().
- If DEBUG is defined, we also print:
iris: Failed to submit batchbuffer: Invalid argument
I couldn't figure out a way to reproduce this issue with real
workloads, but I was able to write a small reproducer to trigger this.
Basically it's a little GL program that has lots of contexts running
in different threads submitting compute shaders that keep using the
same SSBOs. I'll submit this as a piglit test. Edit: Tapani found a
dEQP test case which fails intermintently without this fix, so I'm not
sure a new Piglit is worth it now.
The solution itself is quite simple: just keep bo_deps_lock held all
the way from update_batch_syncobjs() until ioctl(). In order to make
that easier we just call update_batch_syncobjs() a little later. We
have to drop the lock as soon as the ioctl returns because removing
the references on the buffers would trigger other functions to try to
grab the lock again, leading to deadlocks.
Thanks to Kenneth Graunke for pointing out this issue.
This has also been confirmed to fix a dEQP test that was giving
intermittent failures:
dEQP-EGL.functional.sharing.gles2.multithread.random.images.copyteximage2d.12
v2: Move decode_batch() out, just to be safe (Jason).
v3: Do it all after assembling validation_list (Ken).
Cc: mesa-stable
Fixes: 89a34cb845 ("iris: switch to explicit busy tracking")
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14964>
(cherry picked from commit 3532c374de)
src/gallium/auxiliary/tgsi/tgsi_scan.c:287: scan_src_operand: Assertion `info->sampler_targets[index] == target' failed.
assert was being triggered by
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_multisampled_to_singlesampled_blit
using the stencil fallback with zink.
Fixes: f05dfddeb1 ("u_blitter: fix stencil blit fallback for crocus.")
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16069>
(cherry picked from commit 4b7ba3869b)
When setting the dst framebuffer width height, it might be silly
to constrain this beyond the dst resource, but at least constrain
it correctly to take account of x/y offsets.
This fixes some uses of this as a fallback for zink with
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_stencil_blit
Fixes: b4c07a8a87 ("gallium/util: allow scaling blits for stencil-fallback")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16069>
(cherry picked from commit dbc264f504)
Call flush_pipeline_select_3d in CmdBeginRendering() to emit a dummy MEDIA_VFE_STATE
before switching from GPGPU to 3D.
Original commit with the fix:
bc612536 ("anv: Emit a dummy MEDIA_VFE_STATE before switching from GPGPU to 3D")
Fixes: 3501a3f9ed ("anv: Convert to 100% dynamic rendering")
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6201
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15954>
(cherry picked from commit 785b6579ae)
this was incorrectly calculating too small of a map region if
the stride was less than the size of the struct
Fixes: 3eb9932317 ("aux/draw: add a util function for reading back indirect draw params")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15963>
(cherry picked from commit efca37d415)
this is only possible when tc determines the buffer is not in use
and decides to return a pointer immediately, so just give back a staging
buffer
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15979>
(cherry picked from commit d7256043b3)
this was an attempt to minimize the number of xfb barriers being emitted,
but really xfb barriers need to always be emitted in order for xfb to work
cc: mesa-stable
fixes (nv):
KHR-GL46.texture_view.reference_counting
KHR-GL46.transform_feedback_overflow_query_ARB.multiple-streams-multiple-buffers-per-stream
KHR-GL46.transform_feedback_overflow_query_ARB.multiple-streams-one-buffer-per-stream
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16065>
(cherry picked from commit e509598470)
Conflicts:
src/gallium/drivers/zink/ci/zink-nv-fails.txt
a read barrier is needed for resume, yes, but the counter buffer
is always being written to, so write access must always be set
cc: mesa-stable
fixes (nv):
KHR-GL46.transform_feedback.draw_xfb_test
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16065>
(cherry picked from commit fc5edf9b68)
Conflicts:
src/gallium/drivers/zink/ci/zink-nv-fails.txt
this was well-documented, but ultimately wrong: the synchronization
being used was for binding streamout buffers (not counter buffers) as
vertex buffers, which was already handled just fine in the normal
vertex buffer binding
drawing from streamout ONLY uses the counter buffer, which means
the counter buffer needs to be synchronized for reading
cc: mesa-stable
fixes (nv):
KHR-GL46.transform_feedback.draw_xfb_feedbackk_test
KHR-GL46.transform_feedback.draw_xfb_instanced_test
KHR-GL46.transform_feedback.draw_xfb_stream_instanced_test
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16065>
(cherry picked from commit a056cbc691)
Conflicts:
src/gallium/drivers/zink/ci/zink-nv-fails.txt
The Command allocator and command list type must match, but we
are forcing it to D3D12_COMMAND_LIST_TYPE_DIRECT in the reset path.
Fixes: a012b21964 ("microsoft: Initial vulkan-on-12 driver")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16023>
(cherry picked from commit 9fd02d49b8)
There is no reason not to be able to get it.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 34a0ce58c7 ("anv: add a new execution mode for secondary command buffers")
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15968>
(cherry picked from commit 184084e21c)
VkObjectType and VkDebugReportObjectTypeEXT has the same enum-values.
Why the Vulkan WG thought this was a good idea, beats me. But it's what
we have to live with now.
Anyway, instead of having a statement that implicitly casts two
different values from the former to the latter, let's fully relsove the
type as the former, and cast the value when using it instead.
Fixes: 41318a5819 ("vulkan: Use vk_object_base::type for debug_report")
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15547>
(cherry picked from commit b27a2ba4fc)
the pipe cap is used for gating wideline support, so this will always
be 1.0 when not supported
furthermore, the previous code wasn't accurately checking line width
for tess shaders, breaking tests
cc: mesa-stable
fixes (nv):
KHR-GL46.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_PatchVerticesIn
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15960>
(cherry picked from commit d8b66fcbf9)
this is triggered by u_blitter when doing src==dst blits
Fixes: 7781a75229 ("zink: add a renderpass flag for mixed zs layout")
affects:
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality*
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15960>
(cherry picked from commit 9409756ee3)
if a rendertarget-specified image can't be a rendertarget or a blit dst
then it can't be used for the designated functionality and must be rejected
cc: mesa-stable
fixes hangs on various nv driver versions:
dEQP-GLES2.functional.texture.mipmap.2d.generate.rgba5551_fastest
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15960>
(cherry picked from commit 37ac8647fc)
this is illegal, and we'll just have to eat some piglit fails
until indirects are handled
Fixes: f7ade1f188 ("zink: simplify shader i/o assignment")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15976>
(cherry picked from commit 12cf9a1544)
Conflicts:
src/gallium/drivers/zink/ci/zink-lvp-fails.txt
If we change the sate without flushing the bitmap cache, the cache might be
rendered with the new scissor, which excludes some parts that should've
been rendered with the old state, and vice versa.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6233
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15881>
(cherry picked from commit dd7278aa10)
Here we just make sure we match the interpolation type on both
sides of the shader interface. Drivers like d3d12 are expecting
this.
Fixes: 9401990e6f ("nir/linker: set varying from uniform as flat")
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16003>
(cherry picked from commit 4b4bb46af4)
This is used to determine the geometry shader info on GFX9, and it
looks like it was broken for topologies that use adjacency.
This is also used to remove PSIZ from shaders that don't need it.
Found by inspection.
fossils-db (Polaris10):
Totals from 140 (0.10% of 135960) affected shaders:
SGPRs: 10448 -> 9696 (-7.20%)
VGPRs: 4376 -> 4264 (-2.56%)
CodeSize: 164316 -> 161028 (-2.00%)
Instrs: 26449 -> 25767 (-2.58%)
Latency: 184448 -> 180468 (-2.16%)
InvThroughput: 80772 -> 79092 (-2.08%)
VClause: 337 -> 328 (-2.67%); split: -2.97%, +0.30%
SClause: 859 -> 813 (-5.36%); split: -5.70%, +0.35%
Copies: 1027 -> 790 (-23.08%)
PreSGPRs: 2751 -> 2331 (-15.27%)
PreVGPRs: 3887 -> 3836 (-1.31%)
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15948>
(cherry picked from commit ed7d831525)
When we put NIR in the compiler stack for r300, indirect addressing broke
for gallium nine. DX's array indirects round the float value, so the DX
shader gets mapped to a TGSI "ARR ADDR[0] src.x" instruction. Translating
that to NIR maps to r0[f2i32(fround(src.x))]. While we might hope that in
translation back using nir-to-tgsi after optimization we would recognize
the construct and emit ARR again, that's going to be error prone (think
"what if src.x is in a NIR register?") so we need a fallback plan. r300
will be able to handle this lowering, so get it in place first to fix the
regression.
Fixes: #6297
Fixes: 7d2ea9b0ed ("r300: Request NIR shaders from mesa/st and use NIR-to-TGSI.")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15870>
(cherry picked from commit 6947016b46)