ishl can wrap, amul cannot. so we need amul in the backend, or otherwise we
would need to introduce an ashl opcode instead. that doesn't seem better.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>
Some drivers generally need int64 lowered, but prefer to do this lowering
themselves late, to have a chance to optimize targeted int64 patterns before
lowering the rest. This isn't currently possible since nir_lower_int64 takes no
options except what's const* in the shader, and frontends call nir_lower_int64
before passing the shader off to the driver. Add an option to defer int64
lowering. This is a bit ugly but the alternative is replumbing nir_lower_int64's
option handling cross-tree and no-thank-you-not-right-now.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>
This only worked when "a" was 16-bit because a pattern above replaced the
shift.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
It only supported scalar intrinsics because it was written before
nir_opt_vectorize_io existed. The introduction of nir_opt_vectorize_io
exposes this issue. The direct path has been tested. The indirect path
hasn't. That's fine because if we see a CLIP_DIST failure with indirect
in the future, this pass is likely the cause.
This is a prerequisite for enabling nir_opt_varyings for all gallium
drivers.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31994>
to improve CSE of load_barycentric_* and IO vectorization.
This is only for load_interpolated_input, which can never be FLAT.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31968>
This is a generic tool to convert OpenCL C to SPIR-V.
In the future, this will be replaced by `clang` directly using the LLVM SPIR-V
backend, but for now we need a tool in Mesa to provide this functionality with
older LLVM versions.
The important parts are that:
1. It does not depend on NIR or any real platform details. An older mesa_clc
from a previous Mesa version can generally be used to build a newer Mesa to
ease cross-OS builds.
2. Its output can be consumed without any LLVM dependence, which will untangle
the LLVM mess we have now.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31923>
no more nir_register
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Reviewed-by: Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31892>
This patterns were all found in the AGX quads tessellator, a medium-sized OpenCL
kernel. LLVM generates a lot of garbage around booleans which we need to chew
through. Though there's nothing AGX or really OpenCL specific here, so some of
this could help graphics shaders too.
Together, their effect is significant for that kernel instr count & occupancy:
before: 2966 inst, 2310 alu, 2310 fscib, 1216 ic, 23148 bytes, 239 regs, 384 threads
after: 2848 inst, 2246 alu, 2246 fscib, 1000 ic, 22260 bytes, 231 regs, 448 threads
No significant changes on GL shaderdb (a single godot shader regressed 1
instruction, 1344->1345).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31892>
For instance, this issue is triggered on redeonsi with
"piglit/bin/shader_runner tests/spec/glsl-1.50/linker/interface-blocks-multiple-vs-member-count-mismatch.shader_test -auto -fbo":
Indirect leak of 176 byte(s) in 1 object(s) allocated from:
#0 0x7f894b5cd7ef in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb17ef)
#1 0x7f894183aebf in ralloc_size ../src/util/ralloc.c:118
#2 0x7f894183b36e in rzalloc_size ../src/util/ralloc.c:152
#3 0x7f894183b36e in rzalloc_array_size ../src/util/ralloc.c:232
#4 0x7f894182da67 in _mesa_hash_table_init ../src/util/hash_table.c:163
#5 0x7f894182da67 in _mesa_hash_table_create ../src/util/hash_table.c:186
#6 0x7f894169af03 in gl_nir_validate_intrastage_interface_blocks ../src/compiler/glsl/gl_nir_link_interface_blocks.c:533
#7 0x7f89414464a4 in link_intrastage_shaders ../src/compiler/glsl/gl_nir_linker.c:2750
#8 0x7f894144bad2 in gl_nir_link_glsl ../src/compiler/glsl/gl_nir_linker.c:3785
#9 0x7f894128977e in st_link_glsl_to_nir ../src/mesa/state_tracker/st_glsl_to_nir.cpp:515
#10 0x7f894128977e in st_link_shader ../src/mesa/state_tracker/st_glsl_to_nir.cpp:1008
#11 0x7f894113c7b5 in link_program ../src/mesa/main/shaderapi.c:1317
#12 0x7f894113c7b5 in link_program_error ../src/mesa/main/shaderapi.c:1426
#13 0x7f8940afb1bb in _mesa_unmarshal_LinkProgram src/mapi/glapi/gen/marshal_generated2.c:1627
#14 0x7f894063319b in glthread_unmarshal_batch ../src/mesa/main/glthread.c:141
#15 0x7f894184e658 in util_queue_thread_func ../src/util/u_queue.c:294
#16 0x7f89418d220a in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
#17 0x7f894a66a7c3 (/lib64/libc.so.6+0x867c3)
...
SUMMARY: AddressSanitizer: 1392 byte(s) leaked in 11 allocation(s).
Fixes: ffbd763586 ("glsl: add gl_nir_validate_intrastage_interface_blocks()")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31871>
We shouldn't assume the bindings are sparse when we allocate an array
indexed on the binding. See, for example:
dEQP-GLES31.functional.program_interface_query.buffer_variable.random.55
Fixes: 2e833b16bc ("nir/lower_amul: Use num_ubos/ssbos instead of recomputing it.")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31611>
This fixes:
KHR-GLES32.core.tessellation_shader.tessellation_shader_tessellation.max_in_out_attributes
Without this change the assert in gather_output is hit:
assert(!nir_src_is_const(offset) || nir_src_as_uint(offset) == 0)
Because nir_opt_algebraic determines that some ssa values are constant,
but the nir_io_add_const_offset_to_base wasn't run afterwards.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31684>
When offset=0, the pass was a no-op but was setting the progress
flag which could cause infinite loops when this pass is going
to be added to gl_nir_opts.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31684>
Since we check for loop-invariance, we don't have to unconditionally
flag LCSSA phis as divergent in presence of divergent breaks.
This ensures consistency, with or without LCSSA form.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>