By default crocus precompiles shaders to avoid on-screen stuttering
caused by compiling shaders during the draw phase.
Unfortunately, on Intel Gen 6 and higher the precompiled version of the
fragment shader is not used, so every fragment shader is compiled twice.
The duplicate fragment shaders are also added to the memory cache
and disk cache.
This happens because precompilation in crocus_create_fs_state() in
src/gallium/drivers/crocus/crocus_program.c sets key variables to values
that differ from those set by crocus_populate_fs_key() in
src/gallium/drivers/crocus/crocus_state.c.
This commit solves 3 problems:
- it adjusts the predicted value 'input_slots_valid' on Gen 6
- it adjusts the predicted value 'ignore_sample_mask_out' on Gen 6 and higher
- it predicts the value 'multisample_fbo', which helps if samplemask is used
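
The effect of a mispredicted key can be illustrated with a minimal sketch.
The struct and function names below are hypothetical and heavily reduced;
they are not the crocus definitions. Only the three field names come from
this commit message. The assumption is that a variant lookup compares the
whole key, so a single differing field forces a second compile.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for a fragment shader key. */
struct sketch_fs_key {
   uint64_t input_slots_valid;
   bool ignore_sample_mask_out;
   bool multisample_fbo;
};

/* Hypothetical helper to build a key from its three fields. */
struct sketch_fs_key
sketch_make_key(uint64_t input_slots_valid, bool ignore_sample_mask_out,
                bool multisample_fbo)
{
   struct sketch_fs_key key = {
      .input_slots_valid = input_slots_valid,
      .ignore_sample_mask_out = ignore_sample_mask_out,
      .multisample_fbo = multisample_fbo,
   };
   return key;
}

/* A precompiled variant is only reusable if every key field matches. */
bool
sketch_fs_key_matches(struct sketch_fs_key precompiled,
                      struct sketch_fs_key draw_time)
{
   return precompiled.input_slots_valid == draw_time.input_slots_valid &&
          precompiled.ignore_sample_mask_out ==
             draw_time.ignore_sample_mask_out &&
          precompiled.multisample_fbo == draw_time.multisample_fbo;
}

/* Number of compiles for one shader: 1 on a hit, 2 on a mispredicted key. */
unsigned
sketch_compile_count(struct sketch_fs_key precompiled,
                     struct sketch_fs_key draw_time)
{
   return sketch_fs_key_matches(precompiled, draw_time) ? 1 : 2;
}
```

In this sketch a guess that differs only in 'multisample_fbo' already
yields two compiles, which is the behaviour described above.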
Cc: mesa-stable
Signed-off-by: GKraats <vd.kraats@hccnet.nl>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35605>
anv_get_format_plane() in src/intel/vulkan_hasvk/anv_formats.c sets the
format to ISL_FORMAT_UNSUPPORTED, because Ivy Bridge does not support
enough 24-bit and 48-bit formats.
The problem is solved by checking the format after the call.
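
The shape of such a check can be sketched as follows. All names here are
hypothetical stand-ins, not the hasvk code: the enum, the struct and the
lookup function only model a call that may hand back an "unsupported"
marker which the caller has to test before using the result.

```c
#include <stdbool.h>

/* Hypothetical format enum; the values are illustrative only. */
enum sketch_format {
   SKETCH_FORMAT_UNSUPPORTED = 0,
   SKETCH_FORMAT_SOME_SUPPORTED,
};

/* Hypothetical per-plane format description. */
struct sketch_plane_format {
   enum sketch_format format;
};

/* Stand-in for a lookup that can fail on hardware lacking the format. */
struct sketch_plane_format
sketch_get_format_plane(bool hw_supports_format)
{
   struct sketch_plane_format plane = {
      .format = hw_supports_format ? SKETCH_FORMAT_SOME_SUPPORTED
                                   : SKETCH_FORMAT_UNSUPPORTED,
   };
   return plane;
}

/* Check the returned format after the call instead of assuming success. */
bool
sketch_format_is_usable(bool hw_supports_format)
{
   struct sketch_plane_format plane =
      sketch_get_format_plane(hw_supports_format);

   if (plane.format == SKETCH_FORMAT_UNSUPPORTED)
      return false;

   return true;
}
```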
Signed-off-by: GKraats <vd.kraats@hccnet.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40237>
This covers the DX8/DX9 single-frame apitrace collection from
traces-db-private, and the job will appear for anyone in the group with
access to restricted traces. Like other restricted traces jobs, it's set
to allow-failure, so that regressions in the job from changes by
developers not in the group don't block merging by developers with access,
but hopefully the increased visibility lets us catch rendering bugs faster
or avoid merging them in the first place.
The actual runtime for all of our DX8/DX9 trace collection is about 2:30,
and the whole job is about 7:30.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>
The new tool has much better image diffing presentation (thanks to
Danilo's work on turnip's private trace CI), better performance, flake
checking within a single run, parallelized downloads along with replays,
the ability to cache downloaded files to improve runtime, and system
monitoring (for debugging OOM-related slowdowns).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>
The new tool has much better image diffing presentation (thanks to
Danilo's work on turnip's private trace CI), better performance, flake
checking within a single run, parallelized downloads along with replays,
the ability to cache downloaded files to improve runtime, and system
monitoring (for debugging OOM-related slowdowns).
./bin/update_traces_checksum.sh still updates based on the output of a CI
run, but you can also apply a patch file that the tool generates, if you
do offline runs using your traces.toml.
New traces being replayed, in less overall runtime (2 minutes instead of 3):
- minetest/minetest-high-v3.trace (new version, not the old flaky one)
- neverball/neverball-v2.trace
- ror/ror-default.trace
- supertuxkart/supertuxkart-mansion-egl-gles-v2.b.trace
- valve/counterstrike-v2.trace
- valve/portal-2-v2.trace
- xonotic/xonotic-keybench-high-v2.trace
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>
The build script is just a copy and paste of deqp-runner's.
This will be used to replace piglit's trace replay (and I have plans for
better gitlab CI-based performance testing as well).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>
The wine prefix is dropped from build-vkd3d-proton, where it's not needed
(no remaining references in the tree). We do set up a /wineprefix (as a
more obvious name) in the wine installation, and include /usr/lib/*/wine
in test-vk container images and in a tarball uploaded as a LAVA rootfs
overlay.
With this, one should be able to run "wine" successfully.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>
This has improvements to snapshots for looping that I'll be using for the
new trace replay tool, and supports zstd trace compression (which we're
using in traces-db/traces-db-private now).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>
The hangover DXVK builds we want to use for arm64 CI hit this path, and we
have a perfectly reasonable fallback for handling this case (ignore the
sampler, as glslang should have done).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>
This helper might be used as part of another instruction's emission,
which itself might have set the saturate bit in the default
state. This might result in the SYNC being created with the
saturate bit already set.
Since SYNC doesn't have saturate, always clear that field
instead of sometimes leaving it set.
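
A minimal sketch of the idea, using hypothetical types rather than the
real builder and instruction structs: instructions inherit fields from a
default state, and the SYNC helper explicitly clears the one field that
has no meaning for it.

```c
#include <stdbool.h>

/* Hypothetical opcodes; only SYNC is named in the text above. */
enum sketch_opcode {
   SKETCH_OPCODE_ADD,
   SKETCH_OPCODE_SYNC,
};

/* Hypothetical default state inherited by newly emitted instructions. */
struct sketch_default_state {
   bool saturate;
};

struct sketch_inst {
   enum sketch_opcode opcode;
   bool saturate;
};

/* Hypothetical helper to build a default state. */
struct sketch_default_state
sketch_state_with_saturate(bool saturate)
{
   struct sketch_default_state state = { .saturate = saturate };
   return state;
}

/* Generic emission copies whatever the default state currently holds. */
struct sketch_inst
sketch_emit(struct sketch_default_state state, enum sketch_opcode opcode)
{
   struct sketch_inst inst = {
      .opcode = opcode,
      .saturate = state.saturate,
   };
   return inst;
}

/* SYNC has no saturate, so clear it regardless of the default state. */
struct sketch_inst
sketch_emit_sync(struct sketch_default_state state)
{
   struct sketch_inst inst = sketch_emit(state, SKETCH_OPCODE_SYNC);
   inst.saturate = false;
   return inst;
}
```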
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41005>
PANVK_DEBUG_HSR_PREPASS and PANVK_DEBUG_NO_EXTENDED_VA_RANGE have the
same value, meaning they both get toggled when one is.
This commit moves PANVK_DEBUG_HSR_PREPASS to the next value.
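
The class of bug can be sketched like this. The enum below is
hypothetical and the bit positions are made up; only the general pattern
of one-bit-per-flag debug options is assumed. A compile-time check of this
kind is one possible way to catch a shared value, not something this
commit adds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical debug flags; the real bit positions are not shown here. */
enum sketch_debug_flags {
   SKETCH_DEBUG_NO_EXTENDED_VA_RANGE = 1u << 0,
   /* If this were also 1u << 0, both options would toggle together. */
   SKETCH_DEBUG_HSR_PREPASS = 1u << 1,
};

/* Fails the build if the two flags ever share a bit again. */
static_assert((SKETCH_DEBUG_NO_EXTENDED_VA_RANGE &
               SKETCH_DEBUG_HSR_PREPASS) == 0,
              "debug flags must use distinct bits");

/* Test a single flag in a flags word. */
bool
sketch_debug_flag_set(unsigned flags, enum sketch_debug_flags flag)
{
   return (flags & (unsigned)flag) != 0;
}
```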
Fixes: 2d9be41706 ("panvk/v13: Support HSR Prepass")
Reviewed-by: John Anthony <john.anthony@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41106>
The disassembly file had a lot of inconsistencies in indentation, so
align on the standard IndentWidth: 3.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41062>
The current implementation prints FAU entries as 32-bit entries. While
this works, it does not align with the DDK.
Rather than treating FAU as a set of 32-bit entries, treat it as 64-bit
entries that can be split into two words.
This aligns with the DDK and allows for differentiating 32-bit and
64-bit reads based on whether a word modifier is used.
Finally, add entry values to FAU printing to easily look up specific
reads.
For example:
Vertex FAU @ffd93950:
43000000 43000000
3F800000 43000000
43000000 00000000
C7000000 47000000
00000001 00000000
FMAX.f32 r3, r3^, u6
FMIN.f32 r3, r3^, u7
vs
Vertex FAU @ffd93950:
u0 43000000 43000000
u1 3F800000 43000000
u2 43000000 00000000
u3 C7000000 47000000
u4 00000001 00000000
FMAX.f32 r3, r3^, u3.w0
FMIN.f32 r3, r3^, u3.w1
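
The index mapping in the example above (old u6/u7 becoming u3.w0/u3.w1)
can be sketched as below. The function names are hypothetical and this is
not the disassembler code; it also assumes that w0 is the low 32 bits of
the 64-bit entry, which matches the dump above on a little-endian layout.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Map an old-style 32-bit FAU index to its 64-bit entry index. */
unsigned
sketch_fau_entry_index(unsigned index32)
{
   return index32 / 2;
}

/* Map an old-style 32-bit FAU index to the word (0 or 1) in its entry. */
unsigned
sketch_fau_word_index(unsigned index32)
{
   return index32 % 2;
}

/* Extract word 0 (assumed low half) or word 1 (high half) of an entry. */
uint32_t
sketch_fau_word(uint64_t entry, unsigned word)
{
   assert(word < 2);
   return (uint32_t)(entry >> (32 * word));
}

/* Print entries as "uN <w0> <w1>", one 64-bit entry per line. */
void
sketch_print_fau(FILE *fp, const uint64_t *entries, unsigned count)
{
   for (unsigned i = 0; i < count; i++) {
      fprintf(fp, "u%u %08" PRIX32 " %08" PRIX32 "\n", i,
              sketch_fau_word(entries[i], 0),
              sketch_fau_word(entries[i], 1));
   }
}
```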
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41062>
This makes it clear that both registers are read/written, and aligns
with DDK disassembly.
For example:
STORE.i128.istream.slot2.reconverge @r0:r1:r2:r3, r4^, offset:0
vs
STORE.i128.istream.slot2.reconverge @r0:r1:r2:r3, [r4^:r5^], offset:0
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41062>