Instead of just storing a pre-computed min_sample_shading, store both
sample_shading_enable and min_sample_shading and then compute the final
min_sample_shading value when we emit state.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31585>
Annoyingly, 2x2 with 2 passes uses the masks 0xa and 0x5, not 0x3 and
0xc as the current code assumes. When we enable 4x2_D3D, that one gets
even more wonky. This reworks things so that we have two copies of root
descriptor sample_masks array in scratch states, one for 2pass and one
for 4pass, that get copied into the push constants as needed. This
should handle the needs of both 2x2 and 4x2_D3D as well as 4x4 when the
time comes.
Fixes: 6a84d5439d ("nvk: Move the ANTI_ALIAS_CONTROL logic to the MME")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31585>
1. nightly takes a long, the extra few minutes for linking don't matter
2. nightly will run faster, since where it's CPU dependent it's at least +5% perf
3. it may reveal some painful areas of common code or driver.
4. for some jobs (not enabled yet, it generates ugly errors, disable
there
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28684>
If the backend does not implement this too, or some other future transform
modifiess the phi so that this isn't the case (replace the phi with a
bcsel or replace undef with zero), then it will not actually be uniform.
This keeps it enabled to some degree for RADV/ACO.
fossil-db (navi31):
Totals from 76 (0.10% of 79395) affected shaders:
Instrs: 195008 -> 195282 (+0.14%)
CodeSize: 1012592 -> 1015884 (+0.33%)
Latency: 3892826 -> 3898843 (+0.15%); split: -0.00%, +0.15%
InvThroughput: 460681 -> 460964 (+0.06%)
Copies: 13508 -> 13516 (+0.06%)
Branches: 5244 -> 5412 (+3.20%)
PreVGPRs: 5092 -> 5096 (+0.08%)
VALU: 116177 -> 116197 (+0.02%)
SALU: 23449 -> 23785 (+1.43%)
fossil-db (navi21):
Totals from 76 (0.10% of 79395) affected shaders:
Instrs: 164471 -> 164981 (+0.31%)
CodeSize: 883988 -> 888420 (+0.50%)
Latency: 4074287 -> 4082043 (+0.19%)
InvThroughput: 783783 -> 784276 (+0.06%); split: -0.00%, +0.06%
Branches: 5262 -> 5430 (+3.19%)
PreVGPRs: 5100 -> 5104 (+0.08%)
VALU: 116375 -> 116381 (+0.01%)
SALU: 23589 -> 23925 (+1.42%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30211>
The IB2 packet is only supported on the graphics queue. To execute DGC
IB on compute, the previous solution was to submit it separately
without any chaining. Though this solution was incomplete because it's
easy to reach the maximum number of IBs per submit when there is a lot
of ExecuteIndirect() calls.
To fix that, the proposed solution is to implement DGC IB chaining when
it's executed on the compute only. The idea is to add a trailer that is
added at the beginning of the DGC IB (to know the offset). This trailer
is used to chain back back the DGC IB to a normal CS, it's patched at
execution time. Patching is fine because it's not allowed to execute
the same DGC IB concurrently and the entire solution relies on that.
When the DGC IB is executed on graphics, the trailer isn't patched and
it only contains NOPs padding. Performance should be mostly similar.
This fixes
dEQP-VK.dgc.nv.compute.misc.execute_many_*_primary_cmd_compute_queue.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30809>
Names used in libconfig's configuration files only allow alphanumerics,
underscores, dashes and asterisks. Freedreno device names, used as names
in fdperf.cfg, can also contain other characters, currently spaces and
plus characters. Not accounting for those makes it impossible to store
fdperf configuration across separate runs.
Once the Freedreno device name is retrieved, it's now sanitized for use
in fdperf.cfg. Unsupported characters are converted to underscores.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31577>
Bring it up to parity with the LAVA and bare-metal containers by
stripping things we don't need at runtime. There is a lot of stuff we
don't need in container images we only use to execute tests, including
but not limited to the system Mesa which can only cause problems. Call
the same strip-rootfs we already run for LAVA to make sure that this
doesn't happen, as well as slimming down the container image.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31281>
We need to set $LD_LIBRARY_PATH so we can find GL/Vulkan at all, and
$LIBGL_DRIVERS_PATH so Xvfb will pick up the correct DRI modules.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31281>
We need to set $LD_LIBRARY_PATH so we can find GL at all, and
$LIBGL_DRIVERS_PATH so Xvfb will pick up the correct DRI modules.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31281>
We need to set $LIBGL_DRIVERS_PATH so Xvfb will pick up the correct DRI
modules. We were setting $LD_LIBRARY_PATH, but llvmpipe was getting it
wrong, so Weston was picking up the host GLES, which we're about to no
longer install.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31281>