BO_ALLOC_COHERENT is not a good name because it can mean two different
memory types: cached+coherent and uncached+coherent.
Rename it to BO_ALLOC_CACHED_COHERENT, which is closer to how we
actually use it.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28774>
The timestamp is not modified by the CPU; it is written by the GPU and
only read by the CPU.
As all BOs in Iris are CPU coherent, there is no need to keep this
flag.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28774>
This message has been confusing users, especially now that
popular toolkits such as Gtk started using a Vulkan renderer.
Printing a message on non-conformant implementations is also
not actually required, so let's remove it.
We haven't fully finished the GFX12 implementation yet, but on
all other hardware, RADV should work just fine, and is definitely
not meant for "testing use only".
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12314
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32930>
DONTBLOCK is sort of almost good enough except that the api frontend
can also use this and it can't use the full power of Trust Me Buddy™
that qbo maps require
this causes unnecessary ioctl syncs, which annihilates perf in games
that constantly check query results
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31823>
this allows a return without checking syncobj, avoiding overhead,
but when a query still isn't completing after multiple checks then
try checking the pool directly
this circumvents the usual qbo mechanism in specific cases (e.g., Everspace)
where an app fires off a million timestamp queries and the overhead of
checking a timeline semaphore kills perf
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31823>
If there is a 4-byte hole between 2 loads, they are vectorized. Example:
load 4 + hole 4 + load 8 -> load 16
This helps GLSL uniform loads, which are often sparse. See the code for more
info.
RADV could get better code by vectorizing later.
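For illustration only, a minimal sketch of the merge decision in the
example above. The struct, the function name and the MAX_HOLE_BYTES
constant are hypothetical and are not the names used by the actual
vectorizer; the sketch also assumes the first load starts before the
second and that they do not overlap.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one load: byte offset and byte size. */
struct load_range {
   uint32_t offset;
   uint32_t size;
};

/* Assumption taken from the example above: a hole of up to 4 bytes
 * between two loads is tolerated. */
#define MAX_HOLE_BYTES 4u

/* Try to merge load a (lower offset) with load b (higher offset).
 * Returns true and fills *merged when the hole is small enough. */
static bool
try_merge_loads(struct load_range a, struct load_range b,
                struct load_range *merged)
{
   uint32_t a_end = a.offset + a.size;

   if (b.offset < a_end)
      return false; /* overlapping or out of order: not handled here */

   if (b.offset - a_end > MAX_HOLE_BYTES)
      return false; /* hole too large, keep the loads separate */

   merged->offset = a.offset;
   merged->size = b.offset + b.size - a.offset; /* includes the hole */
   return true;
}
```

With a = {0, 4} and b = {8, 8} this yields one 16-byte load, matching
"load 4 + hole 4 + load 8 -> load 16".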
radeonsi+ACO - TOTALS FROM AFFECTED SHADERS (45482/58355)
Spilled SGPRs: 841 -> 747 (-11.18 %)
Code Size: 67552396 -> 65291092 (-3.35 %) bytes
Max Waves: 714439 -> 714520 (0.01 %)
This should have no effect on LLVM because ac_build_buffer_load scalarizes
SMEM, but it's improved for some reason:
radeonsi+LLVM - TOTALS FROM AFFECTED SHADERS (4673/58355)
Spilled SGPRs: 1450 -> 1282 (-11.59 %)
Spilled VGPRs: 106 -> 107 (0.94 %)
Scratch size: 101 -> 102 (0.99 %) dwords per thread
Code Size: 14994624 -> 14956316 (-0.26 %) bytes
Max Waves: 66679 -> 66735 (0.08 %)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29399>
There is nothing preventing ACO from generating loads with unused
components. This happens often with GLSL uniforms. Some of those loads
are partially re-vectorized after this.
radeonsi+ACO:
TOTALS FROM AFFECTED SHADERS (19564/58918)
VGPRs: 732900 -> 728448 (-0.61 %)
Spilled SGPRs: 429 -> 433 (0.93 %)
Code Size: 38446004 -> 38485612 (0.10 %) bytes
Max Waves: 305440 -> 305549 (0.04 %)
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29399>
Use 3DSTATE_URB_ALLOC_* instruction to program URB for multislice device
config.
In case only one slice is available in the device, SliceN fields will be
ignored by HW.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32736>
Use 3DSTATE_URB_ALLOC_* instruction to program URB for multislice device
config.
In case only one slice is available in the device, SliceN fields will be
ignored by HW.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32736>
Use 3DSTATE_URB_ALLOC_* instruction to program URB for multislice device
config.
In case only one slice is available in the device, SliceN fields will be
ignored by HW.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32736>
Instead of using a different voffset VGPR per streamout vertex,
point voffset to the first vertex for all 3 vertices because
the stride and vertex index are constant and can be in the immediate
offset.
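A rough sketch of why the two addressing schemes are equivalent. The
function and parameter names are made up for this illustration, and
any hardware limit on the immediate offset range is ignored.

```c
#include <stdint.h>

/* Before: every vertex gets its own voffset value (one VGPR each). */
static uint32_t
addr_per_vertex_voffset(uint32_t first_vertex_voffset, uint32_t stride,
                        uint32_t vertex_index, uint32_t component_offset)
{
   uint32_t voffset = first_vertex_voffset + vertex_index * stride;
   return voffset + component_offset;
}

/* After: voffset always points to the first vertex; the constant part
 * (vertex_index * stride) is folded into the immediate offset. */
static uint32_t
addr_shared_voffset(uint32_t first_vertex_voffset, uint32_t stride,
                    uint32_t vertex_index, uint32_t component_offset)
{
   uint32_t imm_offset = vertex_index * stride + component_offset;
   return first_vertex_voffset + imm_offset;
}
```

Both functions compute the same address for vertices 0..2, but the
second needs only a single voffset register.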
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32686>
Walk the whole vertex stride thanks to XFB info sorted by offset, gather
individual components from same or different outputs, and once we have
gathered 4, store them as vec4.
It also removes the memory_modes field from VMEM stores because I don't
think it's needed.
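The gathering step could be sketched roughly as below. This is not
the actual lowering code: the structs and the function are
hypothetical, components are reduced to a dword offset plus a value,
and flushing a partial vector at a gap or at the end of the stride is
an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: one XFB component, sorted by dword offset in the vertex. */
struct xfb_component {
   uint32_t dword_offset;
   uint32_t value;
};

/* Hypothetical: one emitted store of 1 to 4 contiguous components. */
struct xfb_store {
   uint32_t dword_offset;
   uint32_t num_components;
   uint32_t values[4];
};

/* Walk the sorted components and emit stores. "stores" must have room
 * for num_comps entries. Returns the number of stores written. */
static unsigned
gather_xfb_stores(const struct xfb_component *comps, unsigned num_comps,
                  struct xfb_store *stores)
{
   unsigned num_stores = 0;
   struct xfb_store cur = {0};

   for (unsigned i = 0; i < num_comps; i++) {
      bool contiguous = cur.num_components > 0 &&
         comps[i].dword_offset == cur.dword_offset + cur.num_components;

      /* A gap ends the current vector early (assumption of this sketch). */
      if (cur.num_components > 0 && !contiguous) {
         stores[num_stores++] = cur;
         cur.num_components = 0;
      }

      if (cur.num_components == 0)
         cur.dword_offset = comps[i].dword_offset;

      cur.values[cur.num_components++] = comps[i].value;

      /* Once 4 components are gathered, store them as a vec4. */
      if (cur.num_components == 4) {
         stores[num_stores++] = cur;
         cur.num_components = 0;
      }
   }

   /* Flush whatever is left at the end of the stride. */
   if (cur.num_components > 0)
      stores[num_stores++] = cur;

   return num_stores;
}
```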
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32686>
Walk the whole vertex stride thanks to XFB info sorted by offset, gather
individual components from same or different outputs, and once we have
gathered 4, store them as vec4.
It also removes the COHERENT flag from VMEM stores because NGG streamout
doesn't use it either and I don't think it's needed.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32686>
The mesa crate should just provide the means of creating those, but the
logic of what to create shouldn't be there.
The passed-in arguments also vary heavily, and this way we can be explicit
about which variants need which inputs.
Reviewed-by: @LingMan
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32903>
This allows us to use more contextual information of the image to create
the pipe_sampler_view properly instead of passing all the required
properties via function arguments.
Reviewed-by: @LingMan
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32903>
This simplifies sampler view tracking a bit for us. Also, drivers will
automatically free the pipe_sampler_view as well.
It was wrong to call into sampler_view_destroy directly anyway, because
pipe_sampler_view is a refcounted object and pipe_sampler_view_reference
should be used instead.
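As a reminder of the pattern, here is a simplified stand-in for a
reference helper of this kind. The struct and function below are
hypothetical and not the gallium implementation; in particular the
counter here is not atomic and "destroyed" merely marks where the
driver's destroy hook would run.

```c
#include <stddef.h>

/* Hypothetical refcounted object standing in for a sampler view. */
struct view {
   int refcount;
   int destroyed;
};

/* Make *dst point to src, taking a reference on src and dropping the
 * reference previously held through *dst. The object is only destroyed
 * when its last reference goes away. */
static void
view_reference(struct view **dst, struct view *src)
{
   struct view *old = *dst;

   if (old == src)
      return;

   if (src)
      src->refcount++;

   if (old && --old->refcount == 0)
      old->destroyed = 1; /* stand-in for the driver's destroy callback */

   *dst = src;
}
```

Calling the destroy hook directly would skip the counter and could
free an object that other holders still reference.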
Reviewed-by: @LingMan
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32903>
I stumbled on this limit - it turns out that large local_sizes apply an
additional limit on GPRs per thread. If we violate this limit, then dmesg
just gives us a rather unhelpful message that the channel is killed:
nouveau 0000:01:00.0: gsp: rc engn:00000001 chid:64 type:13 scope:1 part:233
nouveau 0000:01:00.0: fifo:c00000:0008:0040:[hw_tests::test_[14761]] errored - disabling channel
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32952>
copy requires non-null pointers even for zero-size copies. Skip the
call so it's legal to pass in null buffers of zero size.
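The same kind of guard, sketched in C as an analogy: memcpy() likewise
requires valid pointers even when the size is zero. The wrapper name
is hypothetical and this is not the code touched by this change.

```c
#include <stddef.h>
#include <string.h>

/* Copy size bytes from src to dst, allowing NULL pointers when size is 0. */
static void
copy_bytes(void *dst, const void *src, size_t size)
{
   if (size == 0)
      return; /* skip the call entirely, nothing to copy */

   memcpy(dst, src, size);
}
```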
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32952>
Use Undefined Behaviour Sanitizer to detect issues in v3d/v3dv, as well
as in vc4.
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30880>
This adds build jobs to support Undefined Behaviour Sanitizer (UBSan),
both in x86_64 and arm64.
Reviewed-by: Eric Engestrom <eric@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30880>
Cast the value used in the static assertion to an unsigned integer
instead of the default signed integer.
This has been detected by Undefined Behaviour Sanitizer (UBSan).
```
../src/gallium/drivers/etnaviv/etnaviv_state.c:289:62: error: expression in static assertion is not constant
289 | static_assert((VIVS_PS_OUTPUT_REG2_SATURATE_RT4 << 24) == VIVS_PS_OUTPUT_REG2_SATURATE_RT7, "VIVS_PS_OUTPUT_REG2_SATURATE_RT7");
```
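A minimal sketch of the idea, using hypothetical macro names and
assumed bit values rather than the real generated definitions.
Shifting a signed int so that a 1 lands in the sign bit is undefined
behaviour in C, which is why the uncast expression stops being a
constant expression when the sanitizer is enabled.

```c
#include <stdint.h>

/* Assumed values for illustration only. */
#define EXAMPLE_SATURATE_RT4 0x00000080
#define EXAMPLE_SATURATE_RT7 0x80000000

/* EXAMPLE_SATURATE_RT4 has type int, so (EXAMPLE_SATURATE_RT4 << 24)
 * would overflow a 32-bit signed int. Casting first keeps the shift in
 * unsigned arithmetic, where it is well defined. */
_Static_assert(((uint32_t)EXAMPLE_SATURATE_RT4 << 24) == EXAMPLE_SATURATE_RT7,
               "EXAMPLE_SATURATE_RT7");

/* Same computation at run time, for reference. */
static uint32_t
example_saturate_rt7_from_rt4(void)
{
   return (uint32_t)EXAMPLE_SATURATE_RT4 << 24;
}
```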
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30880>
This fixes VirglStagingMgr tests that try to access a member of a
struct pointer that is NULL.
This has been detected using Undefined Behaviour Sanitizer.
```
Running main() from ../src/gtest/src/gtest_main.cc
[==========] Running 9 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 7 tests from VirglStagingMgr
[ RUN ] VirglStagingMgr.non_fitting_allocation_reallocates_resource
stderr:
../src/gallium/drivers/virgl/tests/virgl_staging_mgr_test.cpp:72:22: runtime error: member access within null pointer of type 'struct virgl_hw_res'
```
Reviewed-by: Eric Engestrom <eric@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30880>
Detected when working on adding support for Undefined Behaviour
Sanitizer, this fixes:
```
../src/gallium/drivers/v3d/v3d_screen.c: In function 'v3d_get_compute_param.part.0':
../src/gallium/drivers/v3d/v3d_screen.c:480:17: error: null destination pointer [-Werror=format-overflow=]
480 | sprintf(ret, "v3d");
| ^~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
```
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30880>
Detected when working on adding support for Undefined Behaviour
Sanitizer, this fixes:
```
../src/gallium/drivers/radeonsi/radeon_vcn_dec.c: In function 'get_h264_msg':
../src/gallium/drivers/radeonsi/radeon_vcn_dec.c:239:50: error: 'k' may be used uninitialized [-Werror=maybe-uninitialized]
239 | && (k == ARRAY_SIZE(dec->h264_valid_poc_num))) {
| ~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../src/gallium/drivers/radeonsi/radeon_vcn_dec.c:77:19: note: 'k' was declared here
77 | unsigned i, j, k;
| ^
cc1: all warnings being treated as errors
```
Reviewed-by: David Rosca <david.rosca@amd.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30880>
Detected when working on adding support for Undefined Behaviour
Sanitizer, this fixes:
```
../src/gallium/drivers/freedreno/a2xx/ir2_nir.c: In function 'load_const':
../src/gallium/drivers/freedreno/a2xx/ir2_nir.c:154:24: error: 'swiz' may be used uninitialized [-Werror=maybe-uninitialized]
154 | unsigned imm_ncomp, swiz, idx, i, j;
| ^~~~
../src/gallium/drivers/freedreno/a2xx/ir2_nir.c:195:30: error: 'imm_ncomp' may be used uninitialized [-Werror=maybe-uninitialized]
195 | so->immediates[idx].ncomp = imm_ncomp;
| ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~
../src/gallium/drivers/freedreno/a2xx/ir2_nir.c:154:13: note: 'imm_ncomp' was declared here
154 | unsigned imm_ncomp, swiz, idx, i, j;
| ^~~~~~~~~
cc1: all warnings being treated as errors
```
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30880>
The companion batch buffer could be executed even if the main batch
buffer failed; this could happen with both the i915 and Xe KMDs.
If the main context/queue was banned and the companion was not, the
submission could still be reported as started successfully when it was not.
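The intended control flow can be sketched as follows. Everything in
this block is hypothetical (a fake batch type and a fake exec helper),
and it assumes the main batch is submitted before the companion; it
only illustrates returning early on the first failure.

```c
#include <stddef.h>

/* Hypothetical batch: "result" is what executing it returns,
 * "executed" records whether it was submitted at all. */
struct fake_batch {
   int result;
   int executed;
};

/* Stand-in for the KMD-specific exec call. */
static int
fake_exec(struct fake_batch *batch)
{
   batch->executed = 1;
   return batch->result;
}

/* Submit the main batch first and only run the companion batch if the
 * main one succeeded. The first error is returned to the caller. */
static int
submit_batches(struct fake_batch *main_batch, struct fake_batch *companion)
{
   int ret = fake_exec(main_batch);
   if (ret != 0)
      return ret; /* main failed: do not run the companion */

   if (companion != NULL)
      ret = fake_exec(companion);

   return ret;
}
```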
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32850>
On i915, i915_queue_exec_locked() could execute the main batch buffer
even if the perf query batch buffer failed.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32850>
Thanks to the migrations, we now have enough 1160g7-volteer DUTs
to increase the parallelism of pre-merge zink TGL testing. This
allows us to reduce the fraction of Piglit tests and introduce
fractional GLESCTS testing.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32864>
Move zink-anv-tgl-traces and zink-anv-tgl-traces-restricted to
the smaller 1130g7-volteer DUT. These jobs are quick and short,
allowing us to use the 1160g7-volteer device for more
performance-sensitive tasks.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32864>