We were using this to remove unreachable instructions following
jumps. The previous patch allowed glsl to nir to handle these
instructions so this call is no longer needed.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27288>
Unlike in GLSL IR it is illegal to add an instruction to a block
following a jump in NIR. Here we add code to the glsl_to_ir pass
to remove any such instructions before they are processed i.e.
we remove them as soon as we process the jumps.
Handling this in glsl to nir allows us to avoid depending on the
lower_jumps() pass being called directly before glsl to nir when
it otherwise doesn't need to be called an additional time.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27288>
visit_exec_list() has always called foreach_in_list_safe() here
were rename that version to visit_exec_list_safe() and create
a version that calls the non-safe foreach call.
There are only 2 users of visit_exec_list() we change lower_jumps
to use the renamed version and leave glsl_to_nir() to use the
non-safe version as it never deletes the current instruction and
in the following patch we will add code that may delete the next
instruction meaning the safe version would be unsafe to use.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27288>
needed by GLSL compiler optimizations that have unlowered sysvals
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28049>
Encoding of cmat_desc is overwriting the base_type with the type of the
elements of the matrix.
Fixes: 2d0f4f2c17 ("compiler/types: Add support for Cooperative Matrix types")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28086>
This intrinsic helps to read the W coordinate stored in the QPU register
when initializing the input data for the fragment shaders.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28072>
If the app requests a swizzle on the shadow sampler which doesn't just
return the red channel or literal 0s/1s, we'll crash attempting to build
the result vector. Use something that's probably valid.
Cc: mesa-stable
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28001>
These don't seem useful, since they're already done in the early optimizations.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27335>
These depend on the subgroup invocation id, so they are divergent.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Fixes: df86c5ffb3 ("nir: add divergence analysis pass.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27962>
Commit 90e364edb0 contained a typo in the glsl_dvec4_type() helper,
instead returning a glsl_ivec4_type. As an ivec4 is 2x smaller than
a dvec4, this also broke piglit sanity on crocus/hsw.
This also fixes the dvec2 helper, though it has not been specifically
tested anywhere.
Fixes: 90e364edb0 ("compiler/types: Add a few more helpers to get builtin types")
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27917>
Use the built-in function from ir_instruction to make sure that we are actually
not casting to anther type by mistake.
Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27114>
These will be used in future to do more validation on functions as
the glsl nir linker is expanded. The first use is in the following
patch.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27841>
This adds support for array and struct function returns in the glsl
to nir pass allowing us to avoid extra calls to the glsl IR
optimisation loop.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27108>
This adds support for array and struct function params in the glsl
to nir pass allowing us to avoid extra calls to the glsl IR
optimisation loop.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27108>
Here we remove the special handling for input params that was hard
to work with and unite it with the output and inout params.
Here a mediump test needs to be updated to what is a more expected
outcome anyway.
We also need to update the code that inserts software f64 to the
new way input params are handled.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27108>
No shader-db changes on any Intel platform.
fossil-db:
All Ice Lake and newer platforms had similar results. (Ice Lake)
Totals:
Instrs: 165513303 -> 165511820 (-0.00%)
Cycles: 15125314947 -> 15125211500 (-0.00%); split: -0.00%, +0.00%
Totals from 82 (0.01% of 656120) affected shaders:
Instrs: 544627 -> 543144 (-0.27%)
Cycles: 22616493 -> 22513046 (-0.46%); split: -0.46%, +0.00%
No fossil-db changes on Gfx9.
Suggested-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27044>
This adds optimizations for iadd, fadd, and ixor with reduce,
inclusive scan, and exclusive scan.
NOTE: The fadd and ixor optimizations had no shader-db or fossil-db
changes on any Intel platform.
NOTE 2: This change "fixes" arb_compute_variable_group_size-local-size
and base-local-size.shader_test on DG2 and MTL. This is just changing
the code path taken to not use whatever path was not working properly
before.
This is a subset of the things optimized by ACO. See also
https://gitlab.freedesktop.org/mesa/mesa/-/issues/3731#note_682802. The
min, max, iand, and ior exclusive_scan optimizations are not
implemented.
Broadwell on shader-db is not happy. I have not investigated.
v2: Silence some warnings about discarding const.
v3: Rename mbcnt to count_active_invocations. Add a big comment
explaining the differences between the two paths. Suggested by Rhys.
shader-db:
All Gfx9 and newer platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 20300384 -> 20299545 (<.01%)
instructions in affected programs: 19167 -> 18328 (-4.38%)
helped: 35 / HURT: 0
total cycles in shared programs: 842809750 -> 842766381 (<.01%)
cycles in affected programs: 2160249 -> 2116880 (-2.01%)
helped: 33 / HURT: 2
total spills in shared programs: 4632 -> 4626 (-0.13%)
spills in affected programs: 206 -> 200 (-2.91%)
helped: 3 / HURT: 0
total fills in shared programs: 5594 -> 5581 (-0.23%)
fills in affected programs: 664 -> 651 (-1.96%)
helped: 3 / HURT: 1
fossil-db results:
All Intel platforms had similar results. (Ice Lake shown)
Totals:
Instrs: 165551893 -> 165513303 (-0.02%)
Cycles: 15132539132 -> 15125314947 (-0.05%); split: -0.05%, +0.00%
Spill count: 45258 -> 45204 (-0.12%)
Fill count: 74286 -> 74157 (-0.17%)
Scratch Memory Size: 2467840 -> 2451456 (-0.66%)
Totals from 712 (0.11% of 656120) affected shaders:
Instrs: 598931 -> 560341 (-6.44%)
Cycles: 184650167 -> 177425982 (-3.91%); split: -3.95%, +0.04%
Spill count: 983 -> 929 (-5.49%)
Fill count: 2274 -> 2145 (-5.67%)
Scratch Memory Size: 52224 -> 35840 (-31.37%)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27044>
This is divergent because it specifically loads sequential values into
successive SIMD lanes.
No shader-db or fossil-db changes on any Intel platform.
Fixes: 9f44a26462 ("nir/divergence: handle load_global_block_intel")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27044>
This allow us to invoke the quad helper.
v2: (Georg)
- Add check for is_gather_implicit_lod
Fixes: 48158636bf ("nir: add is_gather_implicit_lod")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27447>