ret was read after the timeout check, so breaking on timeout returned 0
instead of the actual fence status, potentially reporting a signaled
fence when it was still pending.
Fixes: 441f01e778 ("freedreno/drm/virtio: Drop blocking in host")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41108>
Fixes two bugs in the WAIT_FENCE polling loop:
1. Break on timeout returned VK_SUCCESS because ret was read too late.
2. UINT64_MAX timeout_ns overflowed end_time, causing immediate exit.
Fix by reading rsp->ret before the timeout check and using
OS_TIMEOUT_INFINITE (like virtio_pipe_wait in freedreno) to avoid
overflow.
This prevents premature BO teardown during host-side fault recovery.
Fixes: f17c5297d7 ("tu: Add virtgpu support")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41108>
Enable protected context capability for Android
when TMZ support is available. This is needed for Widevine L1 secure
video playback on Android, which requires a protected context.
Signed-off-by: jinmiliu <jinming.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40980>
V_581B_PFP and V_581B_ME is for pws acquire_mem. Current code
does not cause any problem because we won't pass engine arg
directly to acqure_mem packet. But use a native V_581A_* arg
for better coding.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41069>
New IB print will assert reserved packet field to be zero.
Fixes: 1c75cd958f ("ac: enable the new auto-generated CP packet parser")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41069>
../src/intel/vulkan/genX_init_state.c: In function ‘gfx9_CreateSampler’:
../src/intel/vulkan/genX_init_state.c:1507:40: warning: ‘border_color_offset’ may be used uninitialized [-Wmaybe-uninitialized]
1507 | sampler_state.BorderColorPointer = border_color_offset;
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41116>
By default crocus precompiles shaders, to avoid stuttering at screens,
caused by compiling shaders at the drawing phase.
Unfortunately at intel Gen 6 and higher the precompiled version of the
fragment shaders is not used and every fragment shader is compiled twice.
These double fragment shaders also are added to the memory cache
and disk cache.
This is caused by setting wrong values to variables at the key during
precompiling at routine crocus_create_fs_state() at src/gallium/drivers/crocus/crocus_program.c,
which differ from values at crocus_populate_fs_key() at src/gallium/drivers/crocus/crocus_state.c.
This commit solves 3 problems:
it adjusts the predicted value 'input_slots_valid' at Gen 6
it adjusts the predicted value 'ignore_sample_mask_out' at Gen 6 and higher
it predicts the value 'multisample_fbo' , which helps if samplemask is used
Cc: mesa-stable
Signed-off-by: GKraats <vd.kraats@hccnet.nl>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35605>
Format is set to ISL_FORMAT_UNSUPPORTED at anv_get_format_plane at src/intel/vulkan_hasvk/anv_formats.c,
because Ivy Bridge does not support enough 24 and 48-bits formats.
Problem solved by checking format after the call.
Signed-off-by: GKraats <vd.kraats@hccnet.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40237>
This covers the DX8/DX9 single-frame apitrace collection from
traces-db-private, and the job will appear for anyone in the group with
access to restricted traces. Like other restricted traces jobs, it's set
to allow-failure, so that regressions in the job from changes by
developers not in the group don't block merging by developers with access,
but hopefully the increased visibility lets us catch rendering bugs faster
or avoid merging them in the first place.
The actual runtime for all of our dx8/9 trace collection is about 2:30,
and the whole job is about 7:30.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>
The new tool has much better image diffing presentation (thanks to
Danilo's work on turnip's private trace CI), better performance, flake
checking within a single run, parallelized downloads along with replays,
and ability to cache downloaded files to improve runtime, and system
monitoring (for debugging OOM-related slowdowns).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>
The new tool has much better image diffing presentation (thanks to
Danilo's work on turnip's private trace CI), better performance, flake
checking within a single run, parallelized downloads along with replays,
and ability to cache downloaded files to improve runtime, and system
monitoring (for debugging OOM-related slowdowns).
./bin/update_traces_checksum.sh still updates based on the output of a CI
run, but you can also apply a patch file that the tool generates, if you
do offline runs using your traces.toml.
New traces being replayed, in less overall runtime (2 minutes instead of 3):
- minetest/minetest-high-v3.trace (new version, not the old flaky one)
- neverball/neverball-v2.trace
- ror/ror-default.trace
- supertuxkart/supertuxkart-mansion-egl-gles-v2.b.trace
- valve/counterstrike-v2.trace
- valve/portal-2-v2.trace
- xonotic/xonotic-keybench-high-v2.trace
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>
The hangover DXVK builds we want to use for arm64 CI hit this path, and we
have a perfectly reasonable fallback for handling this case (ignore the
sampler, as glslang should have done).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>
This helper might be used as by another instruction emission,
which itself might have set the saturate bit in the default
state. This might result in the SYNC being created already
with saturate bit set.
Since SYNC doesn't have saturate, clear that field
instead of sometimes having it set.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41005>
PANVK_DEBUG_HSR_PREPASS and PANVK_DEBUG_NO_EXTENDED_VA_RANGE have the
same value, meaning they both get toggled when one is.
This commit moves PANVK_DEBUG_HSR_PREPASS to the following value.
Fixes: 2d9be41706 ("panvk/v13: Support HSR Prepass")
Reviewed-by: John Anthony <john.anthony@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41106>
The disassembly file had a lot of inconsitencies in indentation, so
align on the standard IndentWidth: 3
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41062>
The current implementation prints FAU entries as 32-bit entries. While
this works, it does not align with the DDK.
Rather than treating FAU as a set of 32-bit entries, treat is as 64-bit
entries that can be split in two words.
This aligns with the DDK and has allows for differentiating 32-bit and
64-bit reads based on whether a word modifier is used.
Finally, add entry values to FAU printing to easily look up specific
reads.
For example:
Vertex FAU @ffd93950:
43000000 43000000
3F800000 43000000
43000000 00000000
C7000000 47000000
00000001 00000000
FMAX.f32 r3, r3^, u6
FMIN.f32 r3, r3^, u7
vs
Vertex FAU @ffd93950:
u0 43000000 43000000
u1 3F800000 43000000
u2 43000000 00000000
u3 C7000000 47000000
u4 00000001 00000000
FMAX.f32 r3, r3^, u3.w0
FMIN.f32 r3, r3^, u3.w1
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41062>
This makes it clear that both registers are read/written, and aligns
with DDK disassembly.
For example:
STORE.i128.istream.slot2.reconverge @r0:r1:r2:r3, r4^, offset:0
vs
STORE.i128.istream.slot2.reconverge @r0:r1:r2:r3, [r4^:r5^], offset:0
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41062>
The previous fix was incomplete because if the same graphics pipeline
and the same PS epilog are rebind after vkCmdExecuteCommands(), the PS
epilog state wouldn't be re-emitted, and it will use a wrong VA (in case
both fragment shader user SGPRs aren't similar either).
Resetting the PS epilog to NULL in the primary should prevent any
issues, but this tracking still need to be improved because it caused
two issues recently.
Fixes: 1a00587c44 ("radv: fix a GPU hang with PS epilogs and secondary command buffers")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15176
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41056>
With SPV_KHR_constant_data, it's allowed to specialize array of
constants.
RustiCL changes are from Karol Herbst <kherbst@redhat.com>.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41046>
This patch enables dynamic stack ID control on Xe3+.
Programmed values are the recommended settings from the Bspec.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41066>
When lowering cf we go out of SSA which translates phis into reg
intrinsics. However when converting them back to SSA, initially single
source phis now have an undef source leading to increased register
pressure on the NAK side. This also hinders copy propagation as it's not
designed to handle sources through phis yet.
Totals from 50621 (4.17% of 1212873) affected shaders:
CodeSize: 1605273744 -> 1621029728 (+0.98%); split: -0.34%, +1.32%
Number of GPRs: 4673586 -> 4067935 (-12.96%); split: -12.97%, +0.01%
SLM Size: 263428 -> 258176 (-1.99%)
Static cycle count: 2599838439 -> 2586392435 (-0.52%); split: -1.11%, +0.59%
Spills to memory: 23512 -> 15527 (-33.96%)
Fills from memory: 23512 -> 15527 (-33.96%)
Spills to reg: 64590 -> 57328 (-11.24%); split: -13.83%, +2.58%
Fills from reg: 55559 -> 44319 (-20.23%); split: -22.66%, +2.42%
Max warps/SM: 1189396 -> 1347600 (+13.30%)
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41042>
For DG2 (Bspec 47937) has the same programming note as of Xe2+,
"When this bit is set in the header, Trace Ray Message behaves like a
Ray Query. This message requires a write-back message indicating
RayQuery for all valid Rays (SIMD lanes) have completed."
So this patch is just passing a write back destination register when we
have ray query message.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41039>