If there are no helper invocations required during the
execution of the shader, we can assume that there also
are no helper invocations active.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9058>
load_helper_invocation is an Input Builtin, for which the
value should not change during the execution of a shader.
This new pass inserts an is_helper intrinsic before any
demote() instruction and re-uses its value.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9058>
We already throw out any variables which may have a complex use so we
just need to make sure that our mode checks don't assert if we have a
deref which may_be but not must_be nir_var_function_temp.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9068>
We can't move the shuffle to a new block so this only works if the
shuffle and the bcsel are in the same block. Fortunately, in the
motivating case, this is true.
Also, we have to be careful around discard. We could try really hard to
just avoid moving them past discard but we choose to simply bail if we
see a discard instead.
Fixes: 4ff4d4e569 "nir/opt_intrinsic: Optimize bcsel(b, shuffle..."
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9068>
Previously, when we had a prototype-only function in SPIR-V, we would
compile it just fine and the function would have an impl that did
nothing. This commit changes that so that the nir_function::impl is
NULL to indicate a prototype-only function.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9069>
There are a bunch of cases where we can pretty quickly determine that
the high bits don't matter. In these cases, delete the casts.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8872>
There's reuse of values depending on the stage, so a function that
just takes the value might produce invalid results. All the codebase
was already changed to use the gl_varying_slot_name_for_stage()
instead.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8998>
Facilites the gl_SamplePosition lowering on Bifrost, where the sample
positions are accessed directly in a packed in-memory format.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8774>
Useful for determining whether certain optimizations are legal for a
compute shader (e.g. optimizing workgroup size in the driver).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6312>
asan on llvmpipe with piglit tests/spec/arb_gl_spirv/execution/ssbo/array-indirect.shader_test
reported.
=================================================================
==3288325==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 48 byte(s) in 1 object(s) allocated from:
#0 0x7f5b2d6513cf in __interceptor_malloc (/lib64/libasan.so.6+0xab3cf)
#1 0x7f5b2a1ae810 in ralloc_size ../src/util/ralloc.c:133
#2 0x7f5b2a1ae7e1 in ralloc_context ../src/util/ralloc.c:120
#3 0x7f5b2b210177 in gl_nir_link_uniform_blocks ../src/compiler/glsl/gl_nir_link_uniform_blocks.c:585
#4 0x7f5b2af7f52d in gl_nir_link_spirv ../src/compiler/glsl/gl_nir_linker.c:614
#5 0x7f5b2a3b76fa in st_link_nir ../src/mesa/state_tracker/st_glsl_to_nir.cpp:765
#6 0x7f5b2a3ace7b in st_link_shader ../src/mesa/state_tracker/st_glsl_to_ir.cpp:65
#7 0x7f5b2a471165 in _mesa_glsl_link_shader ../src/mesa/program/ir_to_mesa.cpp:3122
#8 0x7f5b2a97a6d8 in link_program ../src/mesa/main/shaderapi.c:1311
#9 0x7f5b2a97a6d8 in link_program_error ../src/mesa/main/shaderapi.c:1419
#10 0x7f5b2a97df45 in _mesa_LinkProgram ../src/mesa/main/shaderapi.c:1911
#11 0x7f5b299b59e5 in stub_glLinkProgram /mnt/devel/gl/piglit/tests/util/piglit-dispatch-gen.c:33956
#12 0x40a71a in link_and_use_shaders /mnt/devel/gl/piglit/tests/shaders/shader_runner.c:1604
#13 0x415722 in init_test /mnt/devel/gl/piglit/tests/shaders/shader_runner.c:5225
#14 0x4164ce in piglit_init /mnt/devel/gl/piglit/tests/shaders/shader_runner.c:5597
#15 0x7f5b29a214e9 in run_test /mnt/devel/gl/piglit/tests/util/piglit-framework-gl/piglit_winsys_framework.c:73
#16 0x7f5b29a103fe in piglit_gl_test_run /mnt/devel/gl/piglit/tests/util/piglit-framework-gl.c:229
#17 0x407847 in main /mnt/devel/gl/piglit/tests/shaders/shader_runner.c:72
#18 0x7f5b2928f1e1 in __libc_start_main (/lib64/libc.so.6+0x281e1)
SUMMARY: AddressSanitizer: 48 byte(s) leaked in 1 allocation(s).
Fixes: 57239192 ("nir/linker: add gl_nir_link_uniform_blocks.c")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8974>
The base mask previously used was 0xffffffff. This is not correct (but
should still work) for 16-bit and 8-bit values, but it means the high
32-bits of 64-bit values will get chopped off.
Instead of just restricting the pattern to 32-bits (as was done before
00b28a50b2), this extends the optimization in two ways:
1. Make it correct for other bit sizes.
2. Make it work for arbitrary shift counts.
This has the added benefit of reducing the number of patterns actually
added (7 previously, 4 now).
The "Reassociate for improved CSE" part is just reverted to its
pre-00b28a50b2c behavior. I doubt that pattern is likely to have much
impact outside 32-bits.
This change fixes the piglit tests
tests/spec/arb_gpu_shader_int64/fs-shl-of-shr-int64.shader_test and
tests/spec/arb_gpu_shader_int64/fs-iand-of-iadd-int64.shader_test.
All of the shaders helped in shader-db are vertex shaders on platforms
with vector-oriented vertex processing. The shaders contain ((x >> 16)
<< 16). These platforms set lower_extract_word, so the optimization
that transforms (x >> 16) to extract_u16 doesn't trigger. With only ~60
shaders involved, I didn't bother trying to add extract_XYZ versions of
these patterns to try to get those cases.
Fixes: 00b28a50b2 ("nir/algebraic: trivially enable existing 32-bit patterns for all bit sizes")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Haswell and earlier Intel GPUs had simlar results. (Haswell shown)
total instructions in shared programs: 16397554 -> 16397496 (<.01%)
instructions in affected programs: 7961 -> 7903 (-0.73%)
helped: 58
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.36% max: 1.89% x̄: 0.99% x̃: 0.78%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -1.13% -0.85%
Instructions are helped.
total cycles in shared programs: 1035483770 -> 1035483504 (<.01%)
cycles in affected programs: 75922 -> 75656 (-0.35%)
helped: 44
HURT: 2
helped stats (abs) min: 2 max: 12 x̄: 6.14 x̃: 2
helped stats (rel) min: 0.05% max: 1.67% x̄: 0.87% x̃: 0.72%
HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel) min: 0.06% max: 0.06% x̄: 0.06% x̃: 0.06%
95% mean confidence interval for cycles value: -7.28 -4.29
95% mean confidence interval for cycles %-change: -1.03% -0.63%
Cycles are helped.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8852>
Newer versions of SPIR-V require that all the global variables used by
the entry point are declared (in contrast to only I/O in previous
versions), so there's no need to remove dead variables or keep track
of the indirectly used variables.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8456>
OpEntryPoint declares the list of variables in Input and Output
storage classes that are used. Use that information to skip creating
other variables from such storage classes that are unused by the entry
point.
After that change, is not necessary to use remove dead variables for
those types of variables; and because of that is also not necessary to
lower initalizers for output variables.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8456>
Fail when parsing Initializers used in Variables with Storage Classes
that doesn't support it.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8820>
This will be used to implement
VK_KHR_zero_initialize_workgroup_memory.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8708>
Basically every pass in NIR uses nir_ssa_def_rewrite_uses which calls
nir_instr_rewrite_src which is fairly complex because it handles all
sorts of non-SSA cases. Since we already know a priori that every
source written by nir_ssa_def_rewrite_uses is SSA, we can check new_src
once at the top of the function and cut out all that complexity.
While we're at it, we expose a new SSA-only nir_ssa_def_rewrite_uses_ssa
helper which takes an SSA def which avoids the one SSA check. It's also
more convenient 90% of the time.
Compile time as tested by Rhys Perry <pendingchaos02@gmail.com>
Difference at 95.0% confidence
-797.166 +/- 418.649
-0.566174% +/- 0.296441%
(Student's t, pooled s = 325.459)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8790>
The issues fixed by the removal happen when a module has multiple
entry points and conflicting global variables. Neither conditions are
expected in a library.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8786>
Not only these are recalculated in nir_shader_gather_info, but
currently they are also counting all the images / textures in the
module instead of in the shader (entrypoint).
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8786>
It is currently a bitset on top of a uint64_t but there are already
more than 64 values. Change to use BITSET to cover all the
SYSTEM_VALUE_MAX bits.
Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8585>
All uses are passing variables of either nir_var_shader_in or
nir_var_shader_out modes. Note that currently there are more than 64
system values, so the uint64_t wouldn't be enough anyway.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8585>