On Alchemist, the FF_MODE2 documentation says that we must set the
FF_MODE2 timer values for GS and HS to 224. The hardware performance
tuning guide also recommends setting the TDS timer to 4.
On Tigerlake, i915 applies workarounds to set the GS timer to 224
(failing to do so can cause HS/DS unit hangs), and the TDS timer to 4
(for performance). It doesn't currently apply a HS timer there, and
I'm not sure if it's strictly necessary, but given that Alchemist
needed it, and the other two settings matched, let's assume that it
ought to match as well.
Unfortunately, there has been a bug in the i915 workarounds
infrastructure for non-masked context registers where writing one
field of the register zeroes out all the others. So, I believe the
Tigerlake TDS timer value of 4 isn't being applied correctly there,
though the register is also not readable on that platform which
makes it hard to verify. So, this may also speed up tessellation.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9233
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23839>
(cherry picked from commit 1b3669a1ed)
This enables L3 partial write merging for a number of cases that seem
to be getting accidentally disabled by the kernel, which was causing a
serious performance bottleneck on DG2 and MTL platforms. The
"Compressible Partial Write Merge Enable", "Coherent Partial Write
Merge Enable" and "Cross-Tile Partial Write Merge Enable" bits in
L3SQCREG5 were expected to be enabled by default (and confusingly,
they even read off as enabled if you ran 'intel_reg read 0xb158' on an
idle system), but they are getting clobbered during 3D context
initialization by an i915 workaround.
Enabling L3 partial write merging of compressible surfaces in
particular seems to increase rendering fillrate by over 3x in some
cases (e.g. the
"VulkanFillRate/FillRateGPU/resolution:1[0-3]/format:*/blend:0"
fillrate-bound microbenchmarks). Significant improvements can also be
reproduced in most real-world workloads we've tested so far,
e.g. Counter Strike GO improves by ~11%, Shadow Of the Tomb Raider
improves by ~5.5%, and AztecRuins-VK improves by ~6.5% on DG2-512 --
Thanks a lot to Caleb Callaway for these figures. No regressions have
been observed so far.
Even though this patch might strike as surprisingly simple for such a
large payoff, it's the result of Felix DeGrood and I trying to
root-cause the rendering performance gap of DG2 on Linux vs Windows on
and off during the last year, and some of the OA statistics captured
by Felix early this month were greatly helpful for me to connect the
last few dots, so Felix deserves a big chunk of the credit for this
work.
Cc: mesa-stable
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23783>
(cherry picked from commit 427fee3507)
A global aux-context can be created on-demand for cases where we need to
(for example) blit a resource when we only have a screen ptr.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23324>
(cherry picked from commit 75193262fd)
If we consider clearing the group flag of a vec4 register that is used as
source for some instruction we have to take into account that the parent
of the register element may also be part of a group in the parent instruction.
In this case we must not clear the group flag.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9118
Fixes: f3415cb26a (r600/sfn: copy propagate register load chains)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23813>
(cherry picked from commit 34163e19f7)
there are only 2 ubos that can be emitted, except the emitted ubos
can start at an offset based on the first-used ubo, which means this
has to support the full range of ubo indices
fixes oob access in game Beyond All Reason
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23729>
(cherry picked from commit 9803174942)
There is no need to split the CF if only the register ID
in a previous write is the same, we should look at the actual
slots instead, ut we have also to take writes of 0 and 1 into
account.
Cc: mesa-stable
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23710>
(cherry picked from commit 3a569fbf9b)
This reverts commit 0523607ebb.
The issue that commit worked around seems to have been fixed as of
commit 1c8b4940eb ("iris: Emit flushes for push constant source
buffers"). I could no longer reproduce it from that point onward with
this revert applied.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18725>
(cherry picked from commit bb6d300b3a)
This change fixes a buffer overflow by implementing the
special swizzles. This behavior is already available with
evergreen_convert_border_color().
For instance, this issue is triggered on a cayman gpu with
"piglit/bin/texwrap bordercolor -auto -fbo" or "piglit/bin/max-samplers -auto -fbo":
==5610==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000012d20 at pc 0x7fb798cb876f bp 0x7ffd78670460 sp 0x7ffd78670458
READ of size 4 at 0x603000012d20 thread T0
#0 0x7fb798cb876e in cayman_convert_border_color ../src/gallium/drivers/r600/evergreen_state.c:2444
#1 0x7fb798cb876e in evergreen_emit_sampler_states ../src/gallium/drivers/r600/evergreen_state.c:2539
#2 0x7fb7989e6cb2 in r600_emit_atom ../src/gallium/drivers/r600/r600_pipe.h:655
#3 0x7fb7989e6cb2 in r600_draw_vbo ../src/gallium/drivers/r600/r600_state_common.c:2333
#4 0x7fb7985082c7 in u_vbuf_draw_vbo ../src/gallium/auxiliary/util/u_vbuf.c:1497
#5 0x7fb796ef2eda in cso_draw_vbo ../src/gallium/auxiliary/cso_cache/cso_context.h:262
#6 0x7fb796ef2eda in st_draw_gallium_multimode ../src/mesa/state_tracker/st_draw.c:170
#7 0x7fb7970d9cfd in vbo_exec_vtx_flush ../src/mesa/vbo/vbo_exec_draw.c:341
#8 0x7fb7970d32d7 in vbo_exec_FlushVertices_internal ../src/mesa/vbo/vbo_exec_api.c:693
#9 0x7fb7970d32d7 in vbo_exec_FlushVertices ../src/mesa/vbo/vbo_exec_api.c:1193
#10 0x7fb7975f237c in enable_texture ../src/mesa/main/enable.c:337
Fixes: 923d635357 ("r600: fix some border color swizzles on CAYMAN")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23435>
(cherry picked from commit 4284705733)
Indeed, this function was not processing the linked
allocated list.
For instance, this issue is triggered with "piglit/bin/hiz-depth-read-fbo-d24-s0 -auto":
Indirect leak of 40 byte(s) in 1 object(s) allocated from:
#0 0x7f6795638987 in calloc (/usr/lib64/libasan.so.6+0xb1987)
#1 0x7f678bac13b9 in nouveau_heap_alloc ../src/gallium/drivers/nouveau/nouveau_heap.c:64
#2 0x7f678bb6c7e4 in nv50_program_upload_code ../src/gallium/drivers/nouveau/nv50/nv50_program.c:490
#3 0x7f678bb83b92 in nv50_vertprog_validate ../src/gallium/drivers/nouveau/nv50/nv50_shader_state.c:161
#4 0x7f678bba3000 in nv50_state_validate ../src/gallium/drivers/nouveau/nv50/nv50_state_validate.c:552
#5 0x7f678bba3c4d in nv50_state_validate_3d ../src/gallium/drivers/nouveau/nv50/nv50_state_validate.c:575
#6 0x7f678b9e3e92 in nv50_blit_3d ../src/gallium/drivers/nouveau/nv50/nv50_surface.c:1444
#7 0x7f678b9e3e92 in nv50_blit ../src/gallium/drivers/nouveau/nv50/nv50_surface.c:1832
#8 0x7f678a0b378a in blit_to_staging ../src/mesa/state_tracker/st_cb_readpixels.c:337
#9 0x7f678a0b7358 in st_ReadPixels ../src/mesa/state_tracker/st_cb_readpixels.c:516
#10 0x7f6789f82005 in read_pixels ../src/mesa/main/readpix.c:1178
#11 0x7f6789f82005 in _mesa_ReadnPixelsARB ../src/mesa/main/readpix.c:1195
#12 0x7f6789f82ac0 in _mesa_ReadPixels ../src/mesa/main/readpix.c:1210
...
SUMMARY: AddressSanitizer: 80 byte(s) leaked in 2 allocation(s).
Fixes: 67635a0a71 ("nouveau: get rid of tabs")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23592>
(cherry picked from commit 1980934d0d)
ARB_occlusion_query specifies that the query is reset on BeginQueryARB,
not when the fetching the result of the query. This behavior also makes
a lot of sense for the perfmon queries.
CC: mesa-stable
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23557>
(cherry picked from commit b6a4b988ab)
When the RS uses the pixel pipes it seems to destroy/invalidate any
content sitting in the color and depth caches from a previous draw.
Always flush the color and depth cache before using the RS to make
sure that any cache content written by the PE is properly flushed
to memory.
Fixes spec@!opengl 1.0@gl-1.0-drawpixels-depth-test and probably a
few others that are suffering from corruption of PE writes.
CC: mesa-stable
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23530>
(cherry picked from commit b1cd5780d6)
Move the TS cache flush into the same conditional block where
the TS setup is changed. TS cache always needs to be flushed
before making any changes to the TS setup.
CC: mesa-stable
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23530>
(cherry picked from commit cfc1be9590)
conditional render is only supposed to be enabled during renderpasses,
and this ends up doing mismatched start/stop in and out of renderpasses
affects:
GTF-GL46.gtf30.GL3Tests.conditional_render*
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23511>
(cherry picked from commit 6aa9e95021)
DS_XCHG_RET and LDS_CMP_XCHG_RET don't have a version that doesn't return
a value in the LDS red queue. so we have to read the value from the queue
and discard it.
Fixes: 79ca456b (r600/sfn: rewrite NIR backend)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23518>
(cherry picked from commit 976d6de232)
GPUs without native integers lower idiv in lower_int_to_float and
there is no need to call nir_lower_idiv(..) for such GPUs.
Fixes nir crashes I am seeing with gc2000_gles2 CI job.
Fixes: f532202f2d ("etnaviv: use nir_lower_idiv(..) before opt loop")
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23450>
(cherry picked from commit add14d6cfb)
Instead of incrementing the seqno, just directly invalidate any existing
tex cache entries when we update a sampler view.
No reason not to just directly clear stale entries, and avoiding to re-
assign the seqno will simplify the next patch.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23470>
(cherry picked from commit 3f00f4ac30)
The previous implementation was copying the data using the
aligned length (size_dw). The aligned length could overflow
the original buffer size.
For instance, this issue is triggered with "piglit/bin/draw-batch -auto -fbo":
==5736==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fff139c77e8 at pc 0x7f25b350a9a0 bp 0x7fff139c6cb0 sp 0x7fff139c6460
READ of size 8 at 0x7fff139c77e8 thread T0
#0 0x7f25b350a99f in __interceptor_memcpy (/usr/lib64/libasan.so.6+0x3c99f)
#1 0x7f25a8fcdf24 in radeon_emit_array ../src/gallium/include/winsys/radeon_winsys.h:760
#2 0x7f25a8fcdf24 in r600_draw_vbo ../src/gallium/drivers/r600/r600_state_common.c:2448
#3 0x7f25a8ae7ba1 in u_vbuf_draw_vbo ../src/gallium/auxiliary/util/u_vbuf.c:1791
#4 0x7f25a7bc18ca in _mesa_validated_drawrangeelements ../src/mesa/main/draw.c:1696
#5 0x7f25a7bc7e53 in _mesa_DrawElements ../src/mesa/main/draw.c:1824
Fixes: 0cf5d1f226 ("gallium: remove PIPE_CAP_INFO_START_WITH_USER_INDICES and fix all drivers")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23436>
(cherry picked from commit 340311dac9)
Fix defect reported by Coverity Scan.
Resource leak (RESOURCE_LEAK)
leaked_storage: Variable memobj going out of scope leaks the storage it points to.
Fixes: a157133380 ("nvc0/nv50: support and enable EXT_memory_object*")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23417>
(cherry picked from commit bfb092b955)
Fix defect reported by Coverity Scan.
Resource leak (RESOURCE_LEAK)
leaked_storage: Variable fd6_ctx going out of scope leaks the storage it points to.
Fixes: de3b34df97 ("freedreno: Add a6xx backend")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23289>
(cherry picked from commit 7637fba452)
As Matt Turner pointed out, the commit here fixed breaks in Iris and
ANV in kernel versions without support for DRM_I915_QUERY_ENGINE_INFO.
As compute engines are only present in gfx12 and newer, and support
for DRM_I915_QUERY_ENGINE_INFO was added before any gfx12 platform,
we can check for gfx version before trying to get engine info.
For ANV, this is done by checking if engine_info is not NULL, like in
other places in the ANV source code.
Fixes: a364f23a6c ("intel: Make gen12 URB space reservation dependent on compute engine presence")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9099
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Tested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23257>
(cherry picked from commit 42f707e459)
If we run out of space in the commandstream while emitting the
current state the drived states won't be recomputed for the
now fully dirty context as the state derivation is called before
any state emitted. Fix this by explicitly updating the derived
state after a forced flush.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8916
CC: mesa-stable
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23329>
(cherry picked from commit fbd37e6168)
The GS and the FS needs to agree on the driver_location. But we just
used the num_outputs variable for the GS instead of matching the logic
from lower_aaline_instr in nir_draw_helpers.c.
This does that, but cleans up our copy slightly to avoid computing the
needless location, as well as using unsigned values.
This used to *mostly* work before, but only because we were lucky and
not too much crazy stuff went on with the inputs / outputs in
smooth-line cases.
Fixes: edecb66b01 ("nir: avoid generating conflicting output variables")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23316>
(cherry picked from commit ffc77d5262)
We don't need to add an offset in the buffer, because we submit
the offset where the data was written to to the host. The
correction of this offset is also not needed and results in draw
errors.
Fixes: 0cf5d1f226
gallium: remove PIPE_CAP_INFO_START_WITH_USER_INDICES and fix all drivers
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23196>
(cherry picked from commit d4fc359748)
We're currently aligning the offset to the size of the data structure
itself when the upload manager actually expects a POT. Ideally this
would be the next POT that's greater than the size of the structure.
Fixes: c24a574e6c ("iris: Don't allocate a BO per query object")
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20153>
(cherry picked from commit 14dec0c147)