We clone loads for uniforms for each user, which means that each
ppir_node will have its own, however an instruction may have several
nodes that need to load uniforms. Currently, in this case ppir compiler
will place all but one load uniforms nodes into a new instruction and
create an extra mov.
We can optimize that by checking whether load_uniform node in the instruction
is the same as the one we are trying to insert and reuse it. It is safe
to do so, because successors use pipeline register, so regalloc doesn't
care about it.
shader-db:
total instructions in shared programs: 27808 -> 27718 (-0.32%)
instructions in affected programs: 2652 -> 2562 (-3.39%)
helped: 31
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 2.90 x̃: 2
helped stats (rel) min: 0.56% max: 33.33% x̄: 5.12% x̃: 4.11%
95% mean confidence interval for instructions value: -3.78 -2.03
95% mean confidence interval for instructions %-change: -7.24% -2.99%
Instructions are helped.
total loops in shared programs: 4 -> 4 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0
total spills in shared programs: 391 -> 390 (-0.26%)
spills in affected programs: 1 -> 0
helped: 1
HURT: 0
total fills in shared programs: 1213 -> 1210 (-0.25%)
fills in affected programs: 3 -> 0
helped: 1
HURT: 0
LOST: 0
GAINED: 0
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33746>
This will make the job definition default to the UART format for vmware
jobs, as only Collabora's farm relies on the SSH job definition due to
the unreliable Chromebook UART in LAVA.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33874>
Ideally this would be passed in pic params as the values are
in sequence header, but using the surface size also works.
Also add sanity checks for frame size.
Fixes decoding av1-1-b8-22-svc-L2T1 and av1-1-b8-22-svc-L2T2.
Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33737>
Also mark VK_EXT_device_memory_report as supported by anv in
docs/features.txt
Co-authored-by: shenghualin <shenghua.lin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33767>
This change was tested on cayman. This change fixes all the tests fixed by
the evergreen commit, and fixes the 32-bits integer tests as well:
spec/arb_texture_compression_bptc/texwrap formats bordercolor-swizzled/gl_compressed_rgb_bptc_signed_float, swizzled, border color only: fail pass
spec/arb_texture_compression_bptc/texwrap formats bordercolor-swizzled/gl_compressed_rgb_bptc_unsigned_float, swizzled, border color only: fail pass
spec/arb_texture_compression_bptc/texwrap formats bordercolor-swizzled/gl_compressed_rgba_bptc_unorm, swizzled, border color only: fail pass
spec/arb_texture_rg/texwrap formats-int bordercolor-swizzled/gl_r32i, swizzled, border color only: fail pass
spec/arb_texture_rg/texwrap formats-int bordercolor-swizzled/gl_r32ui, swizzled, border color only: fail pass
spec/arb_texture_rg/texwrap formats-int bordercolor-swizzled/gl_rg32i, swizzled, border color only: fail pass
spec/arb_texture_rg/texwrap formats-int bordercolor-swizzled/gl_rg32ui, swizzled, border color only: fail pass
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33579>
This change fixes a regression introduced with 8b5d41cacb.
Indeed, lookup_resid was not updated.
This change was tested on palm and cayman. Here are the tests fixed:
khr-gl4[3-5]/shader_image_size/advanced-nonms-cs-float: fail pass
khr-gl4[3-5]/shader_image_size/advanced-nonms-cs-int: fail pass
khr-gl4[3-5]/shader_image_size/advanced-nonms-cs-uint: fail pass
khr-gl4[3-5]/shader_image_size/advanced-nonms-fs-float: fail pass
khr-gl4[3-5]/shader_image_size/advanced-nonms-fs-int: fail pass
khr-gl4[3-5]/shader_image_size/advanced-nonms-fs-uint: fail pass
khr-gl4[3-5]/shader_image_size/basic-nonms-cs-float: fail pass
khr-gl4[3-5]/shader_image_size/basic-nonms-cs-int: fail pass
khr-gl4[3-5]/shader_image_size/basic-nonms-cs-uint: fail pass
khr-gl4[3-5]/shader_image_size/basic-nonms-fs-float: fail pass
khr-gl4[3-5]/shader_image_size/basic-nonms-fs-int: fail pass
khr-gl4[3-5]/shader_image_size/basic-nonms-fs-uint: fail pass
Fixes: 8b5d41cacb ("r600/sfn: Use range_base for atomics and images")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33352>
Per https://github.com/llvm/llvm-project/issues/119967 these
headers are internal implementation details of libclc and were
never supposed to be installed. They are not available anymore
since LLVM 20. Instead opencl-c.h should be used.
There already ise a code path for including opencl-c.h, so always
use it.
This didn't work for me out of the box, because the build system
currently hardcodes the clang resource directory, which is incorrect
for Fedora at least. Fix this by using GetResourcePath +
CLANG_RESOURCE_DIR provided by clang instead. This is basically
the same as what is done in clc_helper.c
I've still retained the old behavior as a fallback just in case
(e.g. if clang is linked statically?)
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33805>
RUN_COMPUTE_INDIRECT has been found to cause intermittent hangs, so
this change replaces it with RUN_COMPUTE and a set TASK_AXIS_X.
While this task axis might be suboptimal, the performance cost is
somewhat offset by RUN_COMPUTE not being an emulated command.
Fixes: 2ffc05d8d2 ("panvk: Add support for CmdDispatchIndirect")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33841>
RUN_COMPUTE_INDIRECT has been found to cause intermittent hangs, so
this change replaces it with RUN_COMPUTE and a set TASK_AXIS_X.
While this task axis might be suboptimal, the performance cost is
somewhat offset by RUN_COMPUTE not being an emulated command.
Fixes: 447075eeee ("panfrost: Add support for the CSF job frontend")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33841>
BC250 is known to have non-functional compute queue. Thousands
for Vulkan CTS tests fail, and many games are known to have visual
glitches. RADV_DEBUG=nocompute is the known workaround for all these
issues.
Disable compute queue for this chip in both radv and radeonsi.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33116>
AMD BC-250 is a mining board based on an AMD APU with an integrated GPU
that kernel recognizes as Cyan Skillfish.
It is basically RDNA1/GFX10, but with added hardware ray tracing
support. LLVM calls it GFX1013, see
https://llvm.org/docs/AMDGPU/AMDGPUAsmGFX1013.html
Support for this GPU hasn't been extensively tested. Some games are
known to work, some non-trivial ray query compute and ray tracing
pipeline rendering works too. Q2RTX works.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33116>
Skip the zero-index perfcntr collection only if CP_ALWAYS_COUNT is used,
since we want to collect that at the very end/start of collection. But
don't unwittingly ignore that perfcntr if CP_ALWAYS_COUNT isn't being
collected.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Fixes: 86456cf0e6 ("tu: support exposing derived counters through VK_KHR_performance_query")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33852>
As of recent changes, MESA_SHADER_GEOMETRY is handled by the if ladder.
CID: 1643918
Fixes: c33ebf09f5 ("iris: fix handling of GL_*_VERTEX_CONVENTION")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33842>
This change fixes the indirect draw 8-bits path which does
a conversion to 16-bits. This change is implemented to process
the parameters the same way as the other indirect draw paths.
This change was tested on palm and cayman. Here are the tests fixed:
deqp-gles31/functional/draw_indirect/draw_elements_indirect/indices/index_byte: fail pass
deqp-gles31/functional/draw_indirect/random/35: fail pass
deqp-gles31/functional/draw_indirect/random/45: fail pass
khr-gl40/draw_indirect/basic-indicesdatatype-unsigned_byte: fail pass
khr-gl41/draw_indirect/basic-indicesdatatype-unsigned_byte: fail pass
khr-gl42/draw_indirect/basic-indicesdatatype-unsigned_byte: fail pass
khr-gl43/draw_indirect/basic-indicesdatatype-unsigned_byte: fail pass
khr-gl44/draw_indirect/basic-indicesdatatype-unsigned_byte: fail pass
khr-gl45/draw_indirect/basic-indicesdatatype-unsigned_byte: fail pass
Fixes: d80701df8a ("r600g: Implement GL_ARB_draw_indirect for EG/CM")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Acked-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32802>
Wayland EGL has a driver invariant which requires that any `wl_surface`
(or wp_linux_drm_syncobj_surface_v1) calls happen inside the client's
call to eglSwapBuffers(). Submitting surface messages after
eglSwapBuffers() returns causes serialization issues with the Wayland
surface protocol and can lead to the compositor booting the app.
Fixes: 8ade5588e3 ("zink: add kopper api")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12736
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33859>
As explained by d80701df8a, the r600 hardware doesn't implement
the base vertex offset. This change implements this offset as a
constant buffer entry shared with lds.
This change is inspired from 3511a51be0 ("freedreno/ir3: handle
VTXID_BASE for indirect draws").
Note: this feature requires at least evergreen.
This change was tested on palm and cayman. Here are the tests fixed:
spec/arb_draw_indirect/gl_vertexid used with gldrawarraysindirect: fail pass
spec/arb_draw_indirect/gl_vertexid used with gldrawelementsindirect: fail pass
spec/arb_multi_draw_indirect/gl-3.0-multidrawarrays-vertexid -indirect: fail pass
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32769>
Add a specific timeout for the U-Boot action in LAVA job definitions for
rockchip devices. This ensures sufficient time for U-Boot to download
the kernel and set up early network, preventing potential job failures
due to timeout constraints.
This behavior started to happen since LAVA 2025.02 version.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33839>
The `lava_ssh_test_case` wrapper was missing the `set -e` shell option,
which made LAVA system interpret the job was succeeding, because the
`container` namespace was exiting normally, even though the `dut`
namespace was failing.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33839>
LAVA was recently patched [1] with a fix on how parameters are parsed in
`lava-test-case`, so we don't need to repeat quotes to send the
arguments properly to it.
[1] 18c9cf7976
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33839>
It's redundant with EGL surfaceless and it doesn't have much use.
It's also available from the amber branch, so distros should get it from
there if they want to continue packaging it.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33836>
lvp_private.h is in the same directory and lvp_android.c
Other files in the same directory just use
"#include lvp_private.h" too.
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Aaron Ruby <aruby@qnx.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33854>