There is nothing preventing ACO from generating loads with unused
components. This happens often with GLSL uniforms. Some of those loads
are partially re-vectorized after this.
radeonsi+ACO:
TOTALS FROM AFFECTED SHADERS (19564/58918)
VGPRs: 732900 -> 728448 (-0.61 %)
Spilled SGPRs: 429 -> 433 (0.93 %)
Code Size: 38446004 -> 38485612 (0.10 %) bytes
Max Waves: 305440 -> 305549 (0.04 %)
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29399>
Use 3DSTATE_URB_ALLOC_* instruction to program URB for multislice device
config.
In case only one slice is available in the device, SliceN fields will be
ignored by HW.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32736>
Use 3DSTATE_URB_ALLOC_* instruction to program URB for multislice device
config.
In case only one slice is available in the device, SliceN fields will be
ignored by HW.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32736>
Use 3DSTATE_URB_ALLOC_* instruction to program URB for multislice device
config.
In case only one slice is available in the device, SliceN fields will be
ignored by HW.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32736>
Instead of using a different voffset VGPR per streamout vertex,
point voffset to the first vertex for all 3 vertices because
the stride and vertex index are constant and can be in the immediate
offset.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32686>
Walk the whole vertex stride thanks to XFB info sorted by offset, gather
individual components from same or different outputs, and once we have
gathered 4, store them as vec4.
It also removes the memory_modes field from VMEM stores because I don't
think it's needed.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32686>
Walk the whole vertex stride thanks to XFB info sorted by offset, gather
individual components from same or different outputs, and once we have
gathered 4, store them as vec4.
It also removes the COHERENT flag from VMEM stores because NGG streamout
doesn't use it either and I don't think it's needed.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32686>
The mesa crate should just provide the means of creating those, but the
logic of what to create shouldn't be there.
The passed in arguments also heavily vary, and this way we can be explicit
about what variants needs what inputs.
Reviewed-by: @LingMan
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32903>
This allows us to use more contextual information of the image to create
the pipe_sampler_view properly instead of passing all the required
properties via function arguments.
Reviewed-by: @LingMan
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32903>
This simplifies sampler view tracking a bit for us. Also, drivers will
automatically free the pipe_sampler_view as well.
It was wrong to call into sampler_view_destroy directly anyway, because
pipe_sampler_view is a refcounted object and pipe_sampler_view_reference
should be used instead.
Reviewed-by: @LingMan
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32903>
I stumbled on this limit - it turns out that large local_sizes apply an
additonal limit on gprs per thread. If we violate this limit, then dmesg
just gives us a rather unhelpful message that the channel is killed:
nouveau 0000:01:00.0: gsp: rc engn:00000001 chid:64 type:13 scope:1 part:233
nouveau 0000:01:00.0: fifo:c00000:0008:0040:[hw_tests::test_[14761]] errored - disabling channel
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32952>
copy requires non-null pointers even for zero-size copies. Skip the
call so it's legal to pass in null buffers of zero size.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32952>
Use Undefined Behaviour Sanitizer to detect issues in v3d/v3dv, as well
as in vc4.
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30880>
This adds build jobs to support Undefined Behaviour Sanitizer (UBSan),
both in x86_64 and arm64.
Reviewed-by: Eric Engestrom <eric@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30880>
Cast the value used in static assertion to be an unsigned integer,
instead of default signed integer.
This has been detected by Undefined Behaviour Sanitizer (UBSan).
```
../src/gallium/drivers/etnaviv/etnaviv_state.c:289:62: error: expression in static assertion is not constant
289 | static_assert((VIVS_PS_OUTPUT_REG2_SATURATE_RT4 << 24) == VIVS_PS_OUTPUT_REG2_SATURATE_RT7, "VIVS_PS_OUTPUT_REG2_SATURATE_RT7");
```
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30880>
This fixes VirglStagingMgr tests that tries to access a struct member of
a structure that is NULL.
This has been detected using Undefined Behaviour Sanitizer.
```
Running main() from ../src/gtest/src/gtest_main.cc
[==========] Running 9 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 7 tests from VirglStagingMgr
[ RUN ] VirglStagingMgr.non_fitting_allocation_reallocates_resource
stderr:
../src/gallium/drivers/virgl/tests/virgl_staging_mgr_test.cpp:72:22: runtime error: member access within null pointer of type 'struct virgl_hw_res'
```
Reviewed-by: Eric Engestrom <eric@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30880>
Detected when working on adding support for Undefined Behaviour
Sanitizer, this fixes:
```
../src/gallium/drivers/v3d/v3d_screen.c: In function 'v3d_get_compute_param.part.0':
../src/gallium/drivers/v3d/v3d_screen.c:480:17: error: null destination pointer [-Werror=format-overflow=]
480 | sprintf(ret, "v3d");
| ^~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
```
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30880>
Detected when working on adding support for Undefined Behaviour
Sanitizer, this fixes:
```
../src/gallium/drivers/radeonsi/radeon_vcn_dec.c: In function 'get_h264_msg':
../src/gallium/drivers/radeonsi/radeon_vcn_dec.c:239:50: error: 'k' may be used uninitialized [-Werror=maybe-uninitialized]
239 | && (k == ARRAY_SIZE(dec->h264_valid_poc_num))) {
| ~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../src/gallium/drivers/radeonsi/radeon_vcn_dec.c:77:19: note: 'k' was declared here
77 | unsigned i, j, k;
| ^
cc1: all warnings being treated as errors
```
Reviewed-by: David Rosca <david.rosca@amd.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30880>
Detected when working on adding support for Undefined Behaviour
Sanitizer, this fixes:
```
../src/gallium/drivers/freedreno/a2xx/ir2_nir.c: In function 'load_const':
../src/gallium/drivers/freedreno/a2xx/ir2_nir.c:154:24: error: 'swiz' may be used uninitialized [-Werror=maybe-uninitialized]
154 | unsigned imm_ncomp, swiz, idx, i, j;
| ^~~~
../src/gallium/drivers/freedreno/a2xx/ir2_nir.c:195:30: error: 'imm_ncomp' may be used uninitialized [-Werror=maybe-uninitialized]
195 | so->immediates[idx].ncomp = imm_ncomp;
| ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~
../src/gallium/drivers/freedreno/a2xx/ir2_nir.c:154:13: note: 'imm_ncomp' was declared here
154 | unsigned imm_ncomp, swiz, idx, i, j;
| ^~~~~~~~~
cc1: all warnings being treated as errors
```
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30880>
It could run the companion batch buffer even if the main batch buffer
failed, that was possible to happen in i915 and Xe KMD.
In case the main context/queue is banned and companion is not it could
still return that submission was properly start what was not.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32850>
On i915 it could be executing the main batch buffer in
i915_queue_exec_locked() even if the perf query batch buffer failed.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32850>
Thanks to the migrations, we now have enough 1160g7-volteer DUTs
to increase the parallelism of pre-merge zink TGL testing. This
allows us to reduce the fraction of Piglit tests and introduce
fractional GLESCTS testing.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32864>
Move zink-anv-tgl-traces and zink-anv-tgl-traces-restricted to
the smaller 1130g7-volteer DUT. These jobs are quick and short,
allowing us to use the 1160g7-volteer device for more
performance-sensitive tasks.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32864>
Both virgl-iris-traces and virgl-iris-traces-performance jobs are
currently disabled for being broken, but we'll want to use the
smaller volteer DUT for them when they are re-enabled.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32864>
This job has never passed a run in a long time. It fails with
the following error when triggered:
head: cannot open '/dev/dri/renderD128' for reading: No such file or directory
Disable it until it's fixed.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32864>
This is lowered in backend compilers (LLVM or ACO) because it needs
to access ttmp registers which aren't exposed to NIR.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32940>
Fix defect reported by Coverity Scan.
Side effect in assertion (ASSERT_SIDE_EFFECT)
assert_side_effect: Argument ++eot_count of assert() has a side effect.
The containing function might work differently in a non-debug build.
Fixes: ebd6738260 ("intel/elk/chv: Implement WaClearArfDependenciesBeforeEot")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32884>
Fix defect reported by Coverity Scan.
Missing varargs init or cleanup (VARARGS)
missing_va_end: va_end was not called for ap.
Fixes: f8b584d6a5 ("vulkan/runtime,radv: Add shared BVH building framework")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32858>