submit_count is used to disambiguate a batch_id based on the generation
id of a given batch: this value is incremented once on submit and once on
reset such that the diff of the values is > 1 any time the batch does not
represent the fence it was last submitted with
in the case of a batch's first use, however, this value was being incorrectly
incremented such that the first submit would cause disambiguation checks
to erroneously determine that the batch had already completed, breaking synchronization
fixes#9313
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24016>
(cherry picked from commit 3c520892b1)
pipe_picture_desc.decrypt_key was alloced in function
handleVAProtectedSliceDataBufferType(), but nowhere to
free it. Now, it will be freed as the vlVaContext is
destroyed.
Fixes: deb7dc82f6 ("frontends/va: handle protected slice data buffer")
Signed-off-by: Feng Jiang <jiangfeng@kylinos.cn>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23202>
(cherry picked from commit 9790350e9d)
This issue is mainly a consequence of a call to util_blitter_clear()
with unnecessary blitter states, these states are never freed.
This change is inspired from radeonsi and r600.
Note: PAN_SAVE_FRAGMENT_STATE is added and always enabled
at this stage.
For instance, this issue is triggered on Mali-T720 with
"piglit/bin/fcc-read-after-clear sample tex -auto -fbo", "piglit/bin/cubemap -auto"
and "piglit/bin/fbo-srgb -auto" or on Mali-T820 with "piglit/bin/longprim -auto -fbo"
and "piglit/bin/ext_render_snorm-render -auto -fbo"
while setting GALLIUM_REFCNT_LOG=refcnt.log.
cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22522>
(cherry picked from commit 689f38b2b4)
Kopper requires that any depth/stencil images are created through winsys which
was not taken into account by the WGL frontend causing it to hit an assert:
'Assertion failed: ctx->fb_state.zsbuf->texture->bind & PIPE_BIND_DISPLAY_TARGET'
fixes#9256
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24055>
(cherry picked from commit d3662ba461)
Instead of manually indexing a single-dimensional array as 2-dimensional
(and using the wrong stride for the outer array) just actually make it
a 2-dimensional array.
Fixes: 7edae456 ("d3d12: Track up to 16 contexts worth of batch references locally in bos")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24041>
(cherry picked from commit a6740ee7a4)
On Pre-Evergreen hardware we have a flag
"Force Clamp X,Y policy to wrap for CubeMaps"
but it doesn't seem to affect how border clamping is done. With
bf3027 this is set to PIPE_TEX_WRAP_CLAMP_TO_EDGE for cube maps,
and results in the regression reported in #9028.
Forcing repeat mode fixes the issue.
Fixes: bf3027c391
mesa/st: Normalize wrap modes for seamless cubes
Related: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9028
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23848>
(cherry picked from commit 91fa1970c9)
Most users of nir_foreach_function actually want the nir_function_impl, not the
nir_function, and want to skip empty functions (though some graphics-specific
passes sometimes fail to do that part). Add a nir_foreach_function_impl macro
to make that case more ergonomic.
nir_foreach_function_impl(impl, shader) {
...
foo(impl)
}
is equivalent to:
nir_foreach_function(func, shader) {
if (func->impl) {
...
foo(func->impl);
}
}
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23807>
(cherry picked from commit 19daa9283c)
Currently, it's always initialized to 0, but we should take the value from
the grouping passed to the macro. This way parser will have the full
location info, and errors originating from it will show the correct
source file number.
Fixes: a0cfe8c4 ("glsl: Fix missing initialization of yylloc.source")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9229
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23966>
(cherry picked from commit 791785c2b4)
Indeed, util_blitter_clear() requires a call to
util_blitter_save_fragment_constant_buffer_slot(),
but most other blitter functions do not.
For instance, this issue is triggered with:
"piglit/bin/object-namespace-pollution glDrawPixels buffer -auto -fbo"
while setting GALLIUM_REFCNT_LOG=refcnt.log.
Fixes: 03bc7503d4 ("radeonsi: save the fs constant buffer to the util blitter context")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23856>
(cherry picked from commit 80ccc3f822)
If an error occured, make sure we reset the scanout object before
leaving, otherwise the next user of this handle will hit the
refcnt == 0 assert.
Fixes: ad4d7ca833 ("kmsro: Fix renderonly_scanout BO aliasing")
Cc: mesa-stable
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23746>
(cherry picked from commit 7a0033a1c9)
If an error occured, make sure we reset the scanout object before
leaving, otherwise the next user of this handle will hit the
refcnt == 0 assert.
Fixes: ad4d7ca833 ("kmsro: Fix renderonly_scanout BO aliasing")
Cc: mesa-stable
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23746>
(cherry picked from commit 45a27adc3b)
Use ro->bo_map to alloc scanout and make sure we initialize the refcnt
to one.
This fixes leaking the scanout object and the underlying dumb-buffer.
Fixes: ad4d7ca833 ("kmsro: Fix renderonly_scanout BO aliasing")
Cc: mesa-stable
Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23746>
(cherry picked from commit 8087f784e4)
Sets the float color component type in st_visual_to_context_mode()
ensuring float color values are not clamped.
Fixes dEQP-EGL.functional.wide_color.window_fp16_default_colorspace on
asahi, iris and most likely every other driver having it marked as fail
or flake.
Closes: mesa/mesa#9276
Signed-off-by: Janne Grunau <j@jannau.net>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23914>
(cherry picked from commit fd4d0e1cc2)
There is no mention in spec about subtract one of the number of
threads, also Iris and blorp code don't subtract.
Alchemist PRMs: Volume 2a: Command Reference: Instructions: CFE_STATE: Maximum Number of Threads:
Normally set to the maximum number of threads: (# EUs) * (# threads/EU)
Cc: mesa-stable
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23973>
(cherry picked from commit c142736f52)
If a then/else block ends in a jump, the phi nodes do not necessarily
have to reference the always taken branch because they are dead code.
Avoid crashing in this case by only rewriting phis, if the block does
not end in a jump.
cc: mesa-stable
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23150>
(cherry picked from commit e379b9ad8c)
1. Always calculate when asked. This is the sort of optimization that just
introduces bugs. Like one I hit when shuffling register indices around with
the register access changes.
2. Ask before using in RA.
3. Account for precoloured blend inputs.
Small shader-db hit, didn't investigate too much.
total instructions in shared programs: 1518017 -> 1518168 (<.01%)
instructions in affected programs: 2895 -> 3046 (5.22%)
helped: 0
HURT: 24
Instructions are HURT.
total bundles in shared programs: 646756 -> 646782 (<.01%)
bundles in affected programs: 1119 -> 1145 (2.32%)
helped: 1
HURT: 19
Bundles are HURT.
total quadwords in shared programs: 1133694 -> 1133728 (<.01%)
quadwords in affected programs: 1736 -> 1770 (1.96%)
helped: 0
HURT: 20
Quadwords are HURT.
total registers in shared programs: 90596 -> 90612 (0.02%)
registers in affected programs: 108 -> 124 (14.81%)
helped: 0
HURT: 16
Registers are HURT.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Cc: mesa-stable
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23769>
(cherry picked from commit 23010acc10)
this fixes the refcount for the separate shader program to not have a leaked ref
and then fixes the owned program to have the expected number of refs
this happened to work some of the time before because there was an arbitrary unref
in replace_separable_prog(), but this shouldn't have been necessary
Fixes: e3b746e3a3 ("zink: use GPL to handle (simple) separate shader objects")
fixes#9274
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23888>
(cherry picked from commit 4e38061643)
I've recently learned that this is necessary to get
correct results from divergence analysis.
No Fossil DB stats changes on GFX10.3.
Note, when backporting this patch to stable, make sure
the call to nir_convert_to_lcssa is before nir_divergence_analysis,
which may be located in a different place in the stable branch.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23946>
(cherry picked from commit ddeabcc19b)
Primitives generated queries write 1 integer, the primitives-generated
count that is incremented every time a primitive emitted to that stream
reaches the transform feedback stage.
Fixes: 1ebf463a5a ("radv: implement VK_EXT_primitives_generated_query")
Signed-off-by: Juston Li <justonli@google.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23915>
(cherry picked from commit 33ee59af1d)
The introduction of a workaround adding lots of MI_NOOPs broke our
computation.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b9aa66d5d0 ("anv: disable preemption for 3DPRIMITIVE during streamout")
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23792>
(cherry picked from commit a13ac83f1b)
On Alchemist, the FF_MODE2 documentation says that we must set the
FF_MODE2 timer values for GS and HS to 224. The hardware performance
tuning guide also recommends setting the TDS timer to 4.
On Tigerlake, i915 applies workarounds to set the GS timer to 224
(failing to do so can cause HS/DS unit hangs), and the TDS timer to 4
(for performance). It doesn't currently apply a HS timer there, and
I'm not sure if it's strictly necessary, but given that Alchemist
needed it, and the other two settings matched, let's assume that it
ought to match as well.
Unfortunately, there has been a bug in the i915 workarounds
infrastructure for non-masked context registers where writing one
field of the register zeroes out all the others. So, I believe the
Tigerlake TDS timer value of 4 isn't being applied correctly there,
though the register is also not readable on that platform which
makes it hard to verify. So, this may also speed up tessellation.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9233
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23839>
(cherry picked from commit 1b3669a1ed)
This enables L3 partial write merging for a number of cases that seem
to be getting accidentally disabled by the kernel, which was causing a
serious performance bottleneck on DG2 and MTL platforms. The
"Compressible Partial Write Merge Enable", "Coherent Partial Write
Merge Enable" and "Cross-Tile Partial Write Merge Enable" bits in
L3SQCREG5 were expected to be enabled by default (and confusingly,
they even read off as enabled if you ran 'intel_reg read 0xb158' on an
idle system), but they are getting clobbered during 3D context
initialization by an i915 workaround.
Enabling L3 partial write merging of compressible surfaces in
particular seems to increase rendering fillrate by over 3x in some
cases (e.g. the
"VulkanFillRate/FillRateGPU/resolution:1[0-3]/format:*/blend:0"
fillrate-bound microbenchmarks). Significant improvements can also be
reproduced in most real-world workloads we've tested so far,
e.g. Counter Strike GO improves by ~11%, Shadow Of the Tomb Raider
improves by ~5.5%, and AztecRuins-VK improves by ~6.5% on DG2-512 --
Thanks a lot to Caleb Callaway for these figures. No regressions have
been observed so far.
Even though this patch might strike as surprisingly simple for such a
large payoff, it's the result of Felix DeGrood and I trying to
root-cause the rendering performance gap of DG2 on Linux vs Windows on
and off during the last year, and some of the OA statistics captured
by Felix early this month were greatly helpful for me to connect the
last few dots, so Felix deserves a big chunk of the credit for this
work.
Cc: mesa-stable
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23783>
(cherry picked from commit 427fee3507)