Introduces an opt pass that attempts to optimize
load_barycentric_at_{sample,offset} with simpler load_barycentric_*
equivalents where possible, and optionally lowers
load_barycentric_at_sample to load_barycentric_at_offset with a position
derived from the sample ID instead.
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37658>
When you're trying to figure out what shader some NIR pass broke, use
nir_shader_bisect_select() to decide between NIR pass behaviors, and then
nir_shader_bisect.py will help you automatically bisect down to which
source_blake3 is at fault. Once it's identified, it prints you a C call
you can use for selecting that shader specifically, which you can use for
continuing on in your debugging.
On a test I was looking at, this took 10 steps to bisect 134 shaders down
to the source_blake3 of the NIR shader in question.
This idea is heavily lifted from Job Noorman's ir3_shader_bisect.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37468>
This broke after divergence became metadata because the divergence
analysis pass does not support all instructions.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35069>
This was incorrect (it also lowered int64 reductions/scans), and the only
user can just use the general callback to precisely only lower what it wants.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37164>
On a fossil from the blender 4.5.0 vulkan backend, this improves compile
times in nak by about 17%. Compile time of other shaders improves by a
more modest 1.2%.
No stat changes on shader-db.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36184>
This is gl specific and a following fix will add more gl specific
params so here we move it to the st to avoid filling nir.h with
more junk.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37037>
This is gl specific and a following fix will add more gl specific
params so here we move it to the st to avoid filling nir.h with
more junk.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37037>
Wether need_nuw is used is currently decided in two different ways:
- globally through the allow_offset_wrap option;
- per intrinsic but hard-coded in opt_offsets.
Make this more flexible by creating a callback that is called per
intrinsic. This will allow backends to decide, on a per-intrinsic basis,
whether need_nuw is needed.
Note that the main use case for ir3 is to add support for opt_offsets
for global memory accesses. Other intrinsics don't need need_nuw but
global memory accesses do.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37114>
Right now it tries to place reg_write instructions as far up the
predecessor chain as possible. This is useful for a bunch of the passes
that call it since it ensures they don't get placed in dead blocks or in
single successors and things like that. But it screws up NAK's control
flow lowering so we need the option to turn it off and make the pass
place the reg_write instructions in the most obvious place possible.
Fixes: b013d54e4f ("nak/lower_cf: Flag phis as convergent when possible")
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36914>
This adds a generic lowering pass for coop mat flexible dimensions.
This should be suitable for all drivers that implement coop mat2 flexible dimensions
or even just lowering sw exposed sizes to hw sizes.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36544>
We can just place the set structures inside nir_block.
This reduces the number of ralloc calls by 6.7% when compiling Heaven
shaders with radeonsi+ACO using a release build (i.e. not including
nir_validate set allocations, which are also removed).
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>
This is only 1% of all ralloc calls of Unigine Heaven with the gallium noop
driver, but it's an easy one to get rid of.
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>
For intrinsics supporting offset_shift, dealing with their offset is a
bit tricky as we cannot simply add a byte offset to it anymore (which is
what most passes want to do). This commit adds some helpers to add byte
offsets (and adjusting offset_shift accordingly) so that individual
passes don't have to worry about this.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>
It's not only for GL, change to a generic name.
Use command:
find . -type f -not -path '*/.git/*' -exec sed -i 's/\bgl_shader_stage\b/mesa_shader_stage/g' {} +
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>
Store small names in a fixed-sized string in nir_variable.
GLSL IR does the same thing.
When compiling my shader-db with the gallium noop driver, it improves GLSL
compile times by 0.7% (much lower than anticipated).
For Unigine Heaven shaders:
- it eliminates 95.6% ralloc calls for nir_variable names
- the total number of ralloc calls is reduced by 11%
It also adds only 16B to nir_variable, while just the ralloc header
for the name would occupy 40B.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538>
Setting variable names currently always uses ralloc, but the new
nir_variable_* helpers will mostly eliminate ralloc/malloc in a later
commit.
This just updates all places that touch nir_variable names to use the new
helpers.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538>
no other users now.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36516>
a bunch of drivers have versions of this, might as well make a common one.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: John Anthony <john.anthony@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36516>
When I went to use opt_reassociate for tu, I was advised that you want to
do this loop to get the best results. If everyone needs it, let's make it
common code and explain what's going on.
In the process, also make it skip work appropriately when there's no
progress.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36342>