This is similar in principle to the previous commit "intel/brw/xe3+:
brw_compile_fs() implementation for Xe3+." but applied to compute-like
shader stages. It changes the implementation of brw_compile_cs/task/mesh()
to reduce compile time and take advantage of wider dispatch modes more
aggressively than the original logic, since as of Xe3 SIMD32 builds
succeed without spills in most cases thanks to VRT.
The new "optimistic" SIMD selection logic starts with the SIMD width
that is potentially highest performance and only compiles additional
narrower variants if that fails (typically due to spilling), while the
old "pessimistic" logic did the opposite: It started with the
narrowest SIMD width and compiled additional variants with increasing
register pressure until one of them failed to compile.
In typical non-spilling cases where we formerly compiled SIMD16 and
SIMD32 variants of the same compute shader, this change will halve the
number of backend compilations required to build it.
XXX - Possibly don't do this in cases with variable workgroup size
until effect on runtime performance can be measured directly.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
v2: Don't do this for now in cases with variable workgroup size, still
compile every possible variant in such cases.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32664>
The position in the error array already indicate the SIMD in question,
so take off all the formatted printing from the errors -- which in some
cases were just not needed. We lose a little bit of extra context but
it is all easily derivable from the message and the SIMD.
This also will remove the overhead when SIMD selection is being used to
just to find the selected dispatch width -- at a point where the shaders
were already compiled -- and the errors are not used at all.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9849
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25336>
We still update the cs_prog_data, but don't rely on it for this state anymore.
This will allow use the SIMD selector with shaders that don't use cs_prog_data.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19601>
This is a preparation to decouple the storage of what SIMDs
compiled/spilled from the cs_prog_data. This will allow reuse
of SIMD selection code by Bindless Shaders.
And since we have a struct now, move the error array there so
reduce the boilerplate of the users.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19601>
Fix defect reported by Coverity Scan.
Uninitialized pointer field (UNINIT_CTOR)
uninit_member: Non-static class member error is not initialized
in this constructor nor in any functions that it calls.
Fixes: 7558340ebb ("intel/compiler: Add helpers to select SIMD for compute shaders")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13608>
Unless we are combining multiple workgroups in the same HW thread,
there's no advantage of using SIMD16 when SIMD8 already fits the
entire workgroup.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13249>
Variable workgroup size works by compiling as much SIMD variants as
possible and then selecting the right one during dispatch (when the
actual workgroup size is passed to us).
Instead of replicating the logic in a separate function, reuse the
same logic for regular SIMD selection. And move function for that
together with the remaining simd selection functions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13249>
Clean up the logic and move it to functions that work with prog_data
attributes to select the right SIMD. This shouldn't change any
behavior compared to the original.
Having it extracted will allow reuse by Task/Mesh and make it easier
to write tests.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13249>