The immediate addition can easily be handled by nir_opt_offsets, which
will also take any driver limits into account.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>
In ir3, SSBO offsets are in units of the accessed type size so we want
to start using the new offset_shift index.
Even though the shift is implicit for the ir3 intrinsics, we use
nir_intrinsic_copy_const_indices when creating them so we need to make
sure our indices match the ones used by the generic intrinsics.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>
For intrinsics supporting offset_shift, dealing with their offset is a
bit tricky as we cannot simply add a byte offset to it anymore (which is
what most passes want to do). This commit adds some helpers to add byte
offsets (and adjust offset_shift accordingly) so that individual
passes don't have to worry about this.
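
As a rough illustration of why a byte offset cannot simply be added once a
shift is involved, here is a minimal standalone sketch. The struct and
function names are hypothetical and are not the helpers added by this commit;
it only assumes that the stored offset is in units of (1 << shift) bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a constant offset with an implicit shift.
 * The byte offset it represents is value << shift. */
struct shifted_offset {
   uint32_t value;
   unsigned shift;
};

/* Add `bytes` to `off`. If `bytes` is not a multiple of the current unit,
 * lower the shift (rescaling the value) until it is, so nothing is lost. */
static struct shifted_offset
add_byte_offset(struct shifted_offset off, uint32_t bytes)
{
   assert(off.shift < 32);

   while (off.shift > 0 && (bytes & ((1u << off.shift) - 1)) != 0) {
      off.value <<= 1;
      off.shift--;
   }

   off.value += bytes >> off.shift;
   return off;
}
```

For example, adding 8 bytes to an offset of 3 with shift 2 keeps the shift and
yields 5, while adding 2 bytes forces the shift down to 1 and yields 7.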
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>
For load/store intrinsics that take an offset, this specifies the amount
the offset is shifted left to calculate the final offset:
offset = (offset_src + base) << offset_shift
This is useful for backends that have memory operations that use offset
units other than bytes (i.e., where the shift is implicit).
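
The formula above can be sketched in plain C as follows. The function name is
hypothetical and exists only for illustration; it is not part of NIR.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compute the final offset from the offset source,
 * the base index and offset_shift, following
 *   offset = (offset_src + base) << offset_shift */
static inline uint32_t
final_offset(uint32_t offset_src, uint32_t base, unsigned offset_shift)
{
   assert(offset_shift < 32);
   return (offset_src + base) << offset_shift;
}
```

With offset_shift = 2 (offsets counted in 4-byte units), an offset source of 3
and a base of 1 address byte 16.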
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>
Usually we can fold most ldc and ldcx into the instruction using them,
however there are a couple of cases where we can't, e.g. when there is an
indirect offset.
Moving the ldc(x) down to the consumer increases the value ranges for
uniform registers, but lowers them for normal registers.
Totals:
CodeSize: 914650304 -> 914469536 (-0.02%); split: -0.05%, +0.03%
Number of GPRs: 3879754 -> 3863818 (-0.41%); split: -0.42%, +0.01%
Static cycle count: 1073273107 -> 1073101189 (-0.02%); split: -0.09%, +0.08%
Spills to reg: 67219 -> 67707 (+0.73%); split: -0.10%, +0.83%
Fills from reg: 79733 -> 80456 (+0.91%); split: -0.10%, +1.01%
Max warps/SM: 3666036 -> 3672668 (+0.18%); split: +0.18%, -0.00%
Totals from 24235 (27.66% of 87622) affected shaders:
CodeSize: 444747392 -> 444566624 (-0.04%); split: -0.11%, +0.07%
Number of GPRs: 1360384 -> 1344448 (-1.17%); split: -1.20%, +0.03%
Static cycle count: 806310857 -> 806138939 (-0.02%); split: -0.12%, +0.10%
Spills to reg: 35826 -> 36314 (+1.36%); split: -0.19%, +1.55%
Fills from reg: 31863 -> 32586 (+2.27%); split: -0.26%, +2.53%
Max warps/SM: 911328 -> 917960 (+0.73%); split: +0.74%, -0.01%
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36536>
Now that we lower all load_per_vertex_input to
r600_load_per_vertex_input we can remove some dead code
and also change the intrinsic to use only one source value.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36488>
GLSL needs to plumb this from the backend. We should clean up
nir_lower_subgroups to use this later, but I don't have time to churn everything
right now.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36649>
It's not only for GL, so change to a generic name.
Command used:
find . -type f -not -path '*/.git/*' -exec sed -i 's/\bgl_shader_stage\b/mesa_shader_stage/g' {} +
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>
gc_ctx uses a slab allocator. This reduces GLSL compile times by 1-3%
with the gallium noop driver.
This reduces the number of ralloc_size calls for Heaven shaders by 14.3%.
Note that gc_ctx also uses ralloc_size, so the reduction is a net change.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538>
Store small names in a fixed-sized string in nir_variable.
GLSL IR does the same thing.
When compiling my shader-db with the gallium noop driver, it improves GLSL
compile times by 0.7% (much lower than anticipated).
For Unigine Heaven shaders:
- it eliminates 95.6% of ralloc calls for nir_variable names
- the total number of ralloc calls is reduced by 11%
It also adds only 16B to nir_variable, while just the ralloc header
for the name would occupy 40B.
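
A minimal sketch of the general small-string technique, for illustration only.
The struct layout, the names and the use of malloc instead of ralloc are all
assumptions; the only detail taken from this commit is the 16-byte inline
buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SMALL_NAME_SIZE 16 /* inline capacity, including the terminator */

/* Hypothetical stand-in for the name fields of a variable. */
struct var_name {
   char *name;                    /* points at `storage` or at a heap copy */
   char storage[SMALL_NAME_SIZE]; /* used when the name fits inline */
};

/* Set the name. `v` must be zero-initialized before the first call. */
static void
var_name_set(struct var_name *v, const char *name)
{
   size_t len = strlen(name);
   char *old_heap = (v->name == v->storage) ? NULL : v->name;

   if (len < SMALL_NAME_SIZE) {
      /* Short name: copy it inline, no allocation needed. */
      memmove(v->storage, name, len + 1);
      v->name = v->storage;
   } else {
      /* Long name: fall back to a heap allocation. */
      char *copy = malloc(len + 1);
      assert(copy != NULL);
      memcpy(copy, name, len + 1);
      v->name = copy;
   }

   free(old_heap);
}
```

Names shorter than 16 bytes never reach the allocator, which is where the
reduction in allocation calls would come from in a scheme like this.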
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538>
Setting variable names currently always uses ralloc, but the new
nir_variable_* helpers will mostly eliminate ralloc/malloc in a later
commit.
This just updates all places that touch nir_variable names to use the new
helpers.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538>
Using bit_count on the result of ballot doesn't work for targets where
ballot's num_components > 1.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Fixes: d2e1e4442a ("ir3: enable nir_opt_uniform_subgroup")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35669>
Otherwise, the barrier would no longer affect the access.
nir_opt_dead_write_vars should be fine, since it's removing stores, not
moving them.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36080>
Some hardware (AGX, Imagination, Arm) really wants to know the interpolation
qualifiers when compiling the vertex shader. Even though we need to handle this
dynamically for separate shaders, we can improve performance by linking.
nir_opt_varyings already has all the information to do this, so just do so.
Note this has to be done in common code for Gallium, which links varyings within
the GLSL linker but then presents the linked programs as separate shader
objects. This models that nicely, allowing Gallium drivers to optimize without
weird sidebands.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36501>
We'll want this to be able to link interpolation qualifiers in a simple way with
nir_opt_varyings. Add the metadata for it and the FS gathering pass.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36501>
The info is all messed up, so we need to do this right after. Merge this
code.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36501>
No other users now.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36516>
A bunch of drivers have versions of this, so we might as well make a common one.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: John Anthony <john.anthony@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36516>