Some games, ie. Doom Eternal, present from compute following compute post-fx and would benefit from having DCC image stores available.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12862>
Fills the "Wave mode" in "Pipelines" for GPUs that supports Wave32.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12896>
Fills the "Scatch Mem" with "Yes/No" in "Pipelines", this requires
instruction timing to be enabled.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12896>
PSIZ output is only needed when:
1. There is a next stage and it reads it.
2. Primitive topology is point list, in the last vertex pipeline stage.
Zink always adds this output in its vertex (and other) shaders,
because it helps Zink avoid recompiling shader variants.
However, this has a performance impact for RADV because
it needs a scalar memory load. That becomes noticeable
at high primitive rates.
The Fossil stats are unremarkable because our DB doesn't include any
shaders from Zink or D9VK, but there are a few affected shaders.
Note that there may be an increase in LDS use in some GS. This is
because with PSIZ removed the ES per-vertex LDS size is smaller, so
we can squeeze more GS threads in the same workgroup.
Fossil DB stats on Sienna Cichlid:
Totals from 14 (0.01% of 128647) affected shaders:
CodeSize: 119884 -> 119732 (-0.13%)
LDS: 235008 -> 228864 (-2.61%); split: -2.83%, +0.22%
Instrs: 23076 -> 23048 (-0.12%)
Latency: 71667 -> 71625 (-0.06%)
InvThroughput: 19155 -> 18870 (-1.49%)
Copies: 1586 -> 1572 (-0.88%)
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10725>
Not ideal but ac/llvm and RADV works with integers, so passing a
16-bit float type would break more than it helps.
Fixes a few CTS with 16-bit float IO.
Fixes: 3fb229e010 ("ac,radeonsi: load VS inputs at the call site of nir_intrinsic_load_input")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12835>
The Wave32 pass manager has been removed a while ago.
Fixes: 94a1f45e15 ("ac/llvm: set target features per function instead of per target machine")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12833>
Some tokens can be excluded without instruction timing. This reduces
RGP capture sizes significantly.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12853>
Additional work is needed for storage images with DCC without DCC image stores to not be broken.
Fixes black screens in Doom Eternal.
Fixes: #5345
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12818>
By default, Mesa's X11 Vulkan WSI will wait for buffers to be ready
before submitting them to Xwayland when the swapchain is created
with the IMMEDIATE mode.
This is undesirable when the Wayland compositor already monitors
fences. A Wayland compositor may want to know the delay between
the buffer submition and the end of the GPU work, this is impossible
to measure if the WSI waits for the buffer to be ready before
submission.
Since most compositors don't monitor fences, let's introduce a driconf
option for this for now. We can reconsider once more compositors
have better support for fences.
Signed-off-by: Simon Ser <contact@emersion.fr>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11290>
Clearing the first sample is enough as long as CMASK is also cleared
to indicate that other samples are also cleared.
I verified that the first sample is always at the beginning of 256B
blocks.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12483>
Doesn't seem to be causing any issues right now but could with modifiers potentially.
Matches what is in RadeonSI where the comment is also shamelessly stolen from.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12811>
This is allowed by the Vulkan spec and we have to handle this situation
internally. We used to create and bind a 4096x4096 image to copy the
VRS rates but this wasted too much VRAM (~33MiB). Now, the driver only
allocates a HTILE buffer (~1MiB) and bind it to the framebuffer.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12243>
When a subpass is bound without a VRS attachment, the driver has to
create one internally and the copy can be a write only operation.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12243>
It has never really been used due to various issues with GDS in the
past and it will be lowered in NIR at some point.
The driver support is still there because it can likely be re-used.
This implementation can also be used as a reference point.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12695>
This estimation was incorrect, the number of waves doesn't only
depend of the number of VGPRs.
Though, {SPI,COMPUTE}_TMPRING_SIZE.WAVES should limit the number of
scratch waves in flight, not sure if limiting it really works.
This fixes a GPU hang with an upcoming game, and this might also
helps resolving some spurious random GPU hangs.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12700>
VK_FORMAT_R32_SFLOAT seems pretty common and it seems we can be a
little smarter when shader image 32-bit float atomics aren't enabled.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12406>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12671>
this would otherwise result in (UINT64_MAX - gettime()), which can effectively
be rounded to UINT64_MAX without a noticeable change
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12680>
==244108== Conditional jump or move depends on uninitialised value(s)
==244108== at 0x48498D5: bcmp (vg_replace_strmem.c:1129)
==244108== by 0x1C37B7DD: radv_bind_dynamic_state (radv_cmd_buffer.c:237)
==244108== by 0x1C388027: radv_CmdBindPipeline (radv_cmd_buffer.c:4794)
==244108== by 0x14E9C01E: bool update_gfx_pipeline<true>(zink_context*, zink_batch_state*, pipe_prim_type) (zink_draw.cpp:406)
==244108== by 0x14E9AAB9: void zink_draw_vbo<(zink_multidraw)1, (zink_dynamic_state)1, (zink_dynamic_state2)1, (zink_dynamic_vertex_input)1, true>(pipe_cont>
==244108== by 0x14B017EB: tc_call_draw_single (u_threaded_context.c:3033)
==244108== by 0x14AF9C0E: tc_batch_execute (u_threaded_context.c:190)
==244108== by 0x14AFA24F: _tc_sync (u_threaded_context.c:341)
==244108== by 0x14B006E7: tc_texture_subdata (u_threaded_context.c:2549)
==244108== by 0x14238F8C: st_TexSubImage (st_cb_texture.c:2134)
==244108== by 0x14239931: st_TexImage (st_cb_texture.c:2363)
==244108== by 0x1453698A: teximage (teximage.c:3154)
==244108== by 0x1453698A: teximage_err (teximage.c:3181)
==244108== by 0x145388BD: _mesa_TexImage2D (teximage.c:3252)
==244108== by 0x5E88D4: ??? (in /home/zmike/src/piglit/tesseract/bin_unix/linux_64_client)
==244108== by 0x5E9527: ??? (in /home/zmike/src/piglit/tesseract/bin_unix/linux_64_client)
==244108== by 0x5E9B72: ??? (in /home/zmike/src/piglit/tesseract/bin_unix/linux_64_client)
==244108== by 0x5F1092: ??? (in /home/zmike/src/piglit/tesseract/bin_unix/linux_64_client)
==244108== by 0x5F10AC: ??? (in /home/zmike/src/piglit/tesseract/bin_unix/linux_64_client)
==244108== by 0x48CC66: ??? (in /home/zmike/src/piglit/tesseract/bin_unix/linux_64_client)
==244108== by 0x48DDC7: ??? (in /home/zmike/src/piglit/tesseract/bin_unix/linux_64_client)
==244108== by 0x40D525: ??? (in /home/zmike/src/piglit/tesseract/bin_unix/linux_64_client)
==244108== by 0x4FF7B74: (below main) (in /usr/lib64/libc-2.33.so)
==244108== Uninitialised value was created by a stack allocation
==244108== at 0x14ECDF55: zink_create_gfx_pipeline (zink_pipeline.c:53)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12618>