We don't want huge textures to take up large amounts of system memory.
On a system with 1GB of RAM, this would limit us to 256 MB, which is the
same memory usage as a texture of 8192 x 8192 with 4 bytes per
component, which is what we currently limit max texture size to anyway.
The goal here is really to allow using larger textures on systems with
more memory, but that bit comes in a later patch in this series.
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32866>
We're doing a better job at selecting the tiler hierarchy mask in PanVK,
so let's move that to common code and reuse it for the Gallium driver as
well.
The logic to disable the first level for large tile-sizes has been left
at the call-sites, because this is specific to V10 GPUs and later, so it
doesn't apply to the JM code-paths.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32866>
We can handle invalidation of the FBO attachments at job
creation. It solves that we were skipping invalidations in jobs
that had been created by a clear call, as before this change
invalidations were only taken into account the first draw calls
of the job. In these cases where there is a clear after FB
invalidation the resource attachment was tracked as invalidated
for more time than expected. So the stores of the job with the
clear were not being loaded by the next job attaching it because
of the not correct application of the invalidation.
Fixes: 6c46890325 ("v3d: avoid load/store of tile buffer on invalidated framebuffer")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12456
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33148>
There are no calls anymore. It's just a loop with a switch now, which should
be implemented as a jump table by the compiler. The only calls are to
the pipe_context functions.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33087>
gcc should generate a jump table for the switch, so it should be faster than
indirect calls after we inline the calls in the next commit.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33087>
When a 2D_ARRAY view of a 3D image is created, the layer count of the 2D
view should be based on the depth of the 3D image. When
VK_REMAINING_ARRAY_LAYERS is used, we were incorrectly calculating them
from the layer count of the base image.
Fixes future tests: dEQP-VK.renderpass*.remaining_array_layers.*
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33075>
On the last iteration of the loop, `offset` will point to the location
just beyond the last instruction in the program. If the program exactly
fills `p->store` then calling `next_offset()` will read out of bounds.
Instead just let the inner while loop call `next_offset()` one
additional time.
Fixes: a35b9cb625 ("i965: Add annotation data structure and support code.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12486
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33101>
On the last iteration of the loop, `offset` will point to the location
just beyond the last instruction in the program. If the program exactly
fills `p->store` then calling `next_offset()` will read out of bounds.
Instead just let the inner while loop call `next_offset()` one
additional time.
Fixes: a35b9cb625 ("i965: Add annotation data structure and support code.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12486
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33101>
Another drm master may have previously set these to non-zero values, which
can change the image in undesired ways.
Signed-off-by: Xaver Hugl <xaver.hugl@kde.org>
Backport-to: 24.3
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32670>
On AMD, this is a clear win. 2.0 is a free constant,
the multiplication can be fused into fma, or it can
be done as a free output modifier. Otherwise, fmul
and fadd have the same throughput/latency, with the only
possible downside being potentially power consumption.
For other hardware this might not be as clear,
but we should at least choose one option for NIR because
it allows more CSE.
Foz-DB Navi21:
Totals from 12231 (15.41% of 79395) affected shaders:
MaxWaves: 309068 -> 309032 (-0.01%)
Instrs: 11826395 -> 11790132 (-0.31%); split: -0.31%, +0.00%
CodeSize: 63531496 -> 63512868 (-0.03%); split: -0.03%, +0.00%
VGPRs: 551256 -> 551328 (+0.01%); split: -0.00%, +0.02%
SpillSGPRs: 984 -> 979 (-0.51%)
Latency: 88486492 -> 88394296 (-0.10%); split: -0.11%, +0.01%
InvThroughput: 22360595 -> 22300790 (-0.27%); split: -0.27%, +0.00%
VClause: 226267 -> 226273 (+0.00%); split: -0.01%, +0.01%
SClause: 293820 -> 293783 (-0.01%); split: -0.02%, +0.00%
Copies: 727187 -> 727106 (-0.01%); split: -0.03%, +0.02%
PreSGPRs: 539623 -> 539625 (+0.00%)
PreVGPRs: 440843 -> 440946 (+0.02%); split: -0.00%, +0.03%
VALU: 8324962 -> 8288809 (-0.43%); split: -0.43%, +0.00%
SALU: 1278550 -> 1278538 (-0.00%); split: -0.00%, +0.00%
VMEM: 440600 -> 440599 (-0.00%)
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32989>
The drm device file is closed and re-opened between perfetto traces. So
we either need to re-create the fd_device, or use fd_device_new_dup() so
we hold on to our own fd. The former is preferred so that the kernel
can realize when we are no longer reading the perfcntrs.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33073>
Requests with no reply will report errors by default to the event
loop, which then usually cause the not very useful log like this
to be printed:
X Error of failed request: BadAlloc (insufficient resources for operation)
Major opcode of failed request: 149 ()
Minor opcode of failed request: 2
Serial number of failed request: 33
Current serial number in output stream: 34
This commit introduce some helpers to handle the xcb errors in Mesa,
and be able to report errors properly.
For instance the same error will now log:
MESA: error: dri3_alloc_render_buffer:1634 xcb_dri3_pixmap_from_buffer[s] failed
MESA: error: X error: 11
It's not fixing the underlying issue, but at least now tests like
"glx-visuals-stencil -pixmap" and "glx-visuals-depth pixmap" fail
properly.
cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33036>
* GLES3.x is only valid for x <= 2
* The expected error is GLXBadProfileARB, not BadValue
cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33036>
This allows us to drop the reboot condition on SaharaMode which means
that every time the reboot pattern is hit, it is due to a test
execution issue (GPU hang) and not something unrelated to the job.
Additionally, this allows us to try booting up to 5 times which should
help boot reliability.
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Reviewed-by: Eric Engestrom <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32572>
This will be helpful for DUTs that do not boot reliably due to firmware
issue as it will allow us to specify a boot retry count that only
counts towards booting linux and not other issues found during test
execution (like hitting a GPU hang).
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Reviewed-by: Eric Engestrom <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32572>
Gitlab's runner ID is an integer while CI-Tron's is a string. Let's use
the runner description in the job description since this is what
CI-Tron expects.
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Reviewed-by: Eric Engestrom <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32572>
GDS is gone on GFX12 and generated/written primitives queries need to
be emulated using global atomics. This new user SGPR will be used to
pass the 32-bit VA of the shader query buffer.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33041>
The original implementation based on RadeonSI was broken for
pause/resume and for indirect draws with a counter buffer basically.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33017>