Highlights:
- all shader stages and all input/output types are handled, including
inputs and outputs with multiple vertices
- the optimizations performed are: unused input/output removal, constant
and uniform propagation, output deduplication, inter-shader code motion,
and compaction
- constant and uniform propagation and output deduplication work even
if a shader contains multiple stores of the same output, e.g. in GS
- the same optimizations are also performed between output stores and
output loads (for TCS)
- FS inputs are packed agressively. Only flat, interp FP32, and interp
FP16 can't be in the same vec4. Also, if an output value is
non-divergent within a primitive, the corresponding FS input is
opportunistically promoted to flat.
The big comment at the beginning of nir_opt_varyings.c has a detailed
explanation, which is the same as:
https://gitlab.freedesktop.org/mesa/mesa/-/issues/8841
dEQP and GLCTS have incorrect tests that fail with this, see:
https://gitlab.freedesktop.org/mesa/mesa/-/issues/10361
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26819>
Some of those are no longer needed after moving the sin/cos input fixups
to finalize_nir, while others were are just some unneeded remnants from
nir_to_tgsi era.
Almost no shader-db change on RV530:
total instructions in shared programs: 128940 -> 128939 (<.01%)
instructions in affected programs: 154 -> 153 (-0.65%)
helped: 3
HURT: 2
total cycles in shared programs: 197402 -> 197401 (<.01%)
cycles in affected programs: 263 -> 262 (-0.38%)
helped: 3
HURT: 2
or RV370:
total instructions in shared programs: 83946 -> 83944 (<.01%)
instructions in affected programs: 32 -> 30 (-6.25%)
helped: 2
HURT: 0
total cycles in shared programs: 132829 -> 132827 (<.01%)
cycles in affected programs: 93 -> 91 (-2.15%)
helped: 2
HURT: 0
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28070>
before applying the input range normalization. This allows to move the pass
to finalize nir later without worrying we would apply the fixup twice and
also saves few instructions in wined3d shaders, where d3d9 already
guarantees the correct input range.
RV530 shader-db (and similarly for R4xx) improves few Anno1404 shaders:
total instructions in shared programs: 129040 -> 129022 (-0.01%)
instructions in affected programs: 310 -> 292 (-5.81%)
helped: 5
HURT: 0
no change on RV370
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28070>
Added code guards that isolate DRM code paths so platformss that do not have DRM libraries can compile.
MacOS does not have vf86drm library and headers.
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28130>
On systems where the xcb.h is not in the standard prefixed location,
meson needs to be told where to find it.
Adding the `dep_xcb` to the `libdri`build fixes this.
This issue happened on macOS when the brew package manager was not in the default location.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28130>
This patch allows MacOS to compile the dri[2] systems by switching from direct xf86drm.h to
the pre-existing util/libdrm.h.
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28130>
This patch disabled the call to pthread_condattr_setclock on MacOS/Apple platforms.
This funciton is missing from the the Xcode pthread implemetation.
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28130>
MacOSs 'sys/stat.h' version of 'stat' doe snot have the 'st_mtim' that is used on other systems.
The change allows MacOS to use 'st_mtime' without affecting the behaviour on any other platform.
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28130>
With pipelines, the topology class is known at creation time but with
ESO this needs to be re-emitted when the topology change and not only
when graphics shaders are emitted.
This fixes spec@nv_primitive_restart@primitive-restart-* with Zink
when shader object is enabled.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28179>
In 96e0d979a7, the restriction was dropped because we don't compile a
SIMD8 program on Xe2. This change moves it to run_fs() so the
restriction will be added when compiling SIMD16 on Xe2.
Fixes: 96e0d979a7 ("intel/fs: Check fs_visitor instance before using it")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28191>
For guest vram, there's already roundtrip to protect device memory alloc
ordering. This change adds the same protection for shmem used in below
scenarios and optimize to wait for new shmem only.
- reply shmem
- indirect upload shmem
- cmd stream shmem
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28147>
A roundtrip is to ensure a cmd via virtqueue happens on the renderer
side before a ring relies on it. Since venus is now with multi-ring, the
roundtrip submit and wait should belong to ring instead of instance, and
each ring owns its own roundtrip seqno to synchronize with virtqueue.
No behavior change.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28147>