This covers the DX8/DX9 single-frame apitrace collection from
traces-db-private, and the job will appear for anyone in the group with
access to restricted traces. Like other restricted traces jobs, it's set
to allow-failure, so that regressions in the job from changes by
developers not in the group don't block merging by developers with access,
but hopefully the increased visibility lets us catch rendering bugs faster
or avoid merging them in the first place.
The actual runtime for all of our dx8/9 trace collection is about 2:30,
and the whole job is about 7:30.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>
The new tool has much better image diffing presentation (thanks to
Danilo's work on turnip's private trace CI), better performance, flake
checking within a single run, parallelized downloads along with replays,
and ability to cache downloaded files to improve runtime, and system
monitoring (for debugging OOM-related slowdowns).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>
The new tool has much better image diffing presentation (thanks to
Danilo's work on turnip's private trace CI), better performance, flake
checking within a single run, parallelized downloads along with replays,
and ability to cache downloaded files to improve runtime, and system
monitoring (for debugging OOM-related slowdowns).
./bin/update_traces_checksum.sh still updates based on the output of a CI
run, but you can also apply a patch file that the tool generates, if you
do offline runs using your traces.toml.
New traces being replayed, in less overall runtime (2 minutes instead of 3):
- minetest/minetest-high-v3.trace (new version, not the old flaky one)
- neverball/neverball-v2.trace
- ror/ror-default.trace
- supertuxkart/supertuxkart-mansion-egl-gles-v2.b.trace
- valve/counterstrike-v2.trace
- valve/portal-2-v2.trace
- xonotic/xonotic-keybench-high-v2.trace
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>
The build script is just copy and paste of deqp-runner's.
This will be used to replace piglit's trace replay (and I have plans for
better gitlab CI-based performance testing as well)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>
The wine prefix is dropped from build-vkd3d-proton, where it's not needed
(no remaining references in the tree). We do set up a /wineprefix (as a
more obvious name) in the wine installation, and include /usr/lib/*/wine
in test-vk container images and in a tarball uploaded as a LAVA rootfs
overlay.
With this, one should be able to run "wine" successfully.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>
This has improvements to snapshots for looping that I'll be using for the
new trace replay tool, and supports zstd trace compression (which we're
using in traces-db/traces-db-private now).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>
The hangover DXVK builds we want to use for arm64 CI hit this path, and we
have a perfectly reasonable fallback for handling this case (ignore the
sampler, as glslang should have done).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>
This helper might be used as by another instruction emission,
which itself might have set the saturate bit in the default
state. This might result in the SYNC being created already
with saturate bit set.
Since SYNC doesn't have saturate, clear that field
instead of sometimes having it set.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41005>
PANVK_DEBUG_HSR_PREPASS and PANVK_DEBUG_NO_EXTENDED_VA_RANGE have the
same value, meaning they both get toggled when one is.
This commit moves PANVK_DEBUG_HSR_PREPASS to the following value.
Fixes: 2d9be41706 ("panvk/v13: Support HSR Prepass")
Reviewed-by: John Anthony <john.anthony@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41106>
The disassembly file had a lot of inconsitencies in indentation, so
align on the standard IndentWidth: 3
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41062>
The current implementation prints FAU entries as 32-bit entries. While
this works, it does not align with the DDK.
Rather than treating FAU as a set of 32-bit entries, treat is as 64-bit
entries that can be split in two words.
This aligns with the DDK and has allows for differentiating 32-bit and
64-bit reads based on whether a word modifier is used.
Finally, add entry values to FAU printing to easily look up specific
reads.
For example:
Vertex FAU @ffd93950:
43000000 43000000
3F800000 43000000
43000000 00000000
C7000000 47000000
00000001 00000000
FMAX.f32 r3, r3^, u6
FMIN.f32 r3, r3^, u7
vs
Vertex FAU @ffd93950:
u0 43000000 43000000
u1 3F800000 43000000
u2 43000000 00000000
u3 C7000000 47000000
u4 00000001 00000000
FMAX.f32 r3, r3^, u3.w0
FMIN.f32 r3, r3^, u3.w1
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41062>
This makes it clear that both registers are read/written, and aligns
with DDK disassembly.
For example:
STORE.i128.istream.slot2.reconverge @r0:r1:r2:r3, r4^, offset:0
vs
STORE.i128.istream.slot2.reconverge @r0:r1:r2:r3, [r4^:r5^], offset:0
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41062>
The previous fix was incomplete because if the same graphics pipeline
and the same PS epilog are rebind after vkCmdExecuteCommands(), the PS
epilog state wouldn't be re-emitted, and it will use a wrong VA (in case
both fragment shader user SGPRs aren't similar either).
Resetting the PS epilog to NULL in the primary should prevent any
issues, but this tracking still need to be improved because it caused
two issues recently.
Fixes: 1a00587c44 ("radv: fix a GPU hang with PS epilogs and secondary command buffers")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15176
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41056>
With SPV_KHR_constant_data, it's allowed to specialize array of
constants.
RustiCL changes are from Karol Herbst <kherbst@redhat.com>.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41046>
This patch enables dynamic stack ID control on Xe3+.
Programmed values are the recommended settings from the Bspec.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41066>
When lowering cf we go out of SSA which translates phis into reg
intrinsics. However when converting them back to SSA, initially single
source phis now have an undef source leading to increased register
pressure on the NAK side. This also hinders copy propagation as it's not
designed to handle sources through phis yet.
Totals from 50621 (4.17% of 1212873) affected shaders:
CodeSize: 1605273744 -> 1621029728 (+0.98%); split: -0.34%, +1.32%
Number of GPRs: 4673586 -> 4067935 (-12.96%); split: -12.97%, +0.01%
SLM Size: 263428 -> 258176 (-1.99%)
Static cycle count: 2599838439 -> 2586392435 (-0.52%); split: -1.11%, +0.59%
Spills to memory: 23512 -> 15527 (-33.96%)
Fills from memory: 23512 -> 15527 (-33.96%)
Spills to reg: 64590 -> 57328 (-11.24%); split: -13.83%, +2.58%
Fills from reg: 55559 -> 44319 (-20.23%); split: -22.66%, +2.42%
Max warps/SM: 1189396 -> 1347600 (+13.30%)
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41042>
For DG2 (Bspec 47937) has the same programming note as of Xe2+,
"When this bit is set in the header, Trace Ray Message behaves like a
Ray Query. This message requires a write-back message indicating
RayQuery for all valid Rays (SIMD lanes) have completed."
So this patch is just passing a write back destination register when we
have ray query message.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41039>