This is a generic tool to convert OpenCL C to SPIR-V.
In the future, this will be replaced by `clang` directly using the LLVM SPIR-V
backend, but for now we need a tool in Mesa to provide this functionality with
older LLVM versions.
The important parts are that:
1. It does not depend on NIR or any real platform details. An older mesa_clc
from a previous Mesa version can generally be used to build a newer Mesa to
ease cross-OS builds.
2. Its output can be consumed without any LLVM dependence, which will untangle
the LLVM mess we have now.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31923>
no more nir_register
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Reviewed-by: Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31892>
This patterns were all found in the AGX quads tessellator, a medium-sized OpenCL
kernel. LLVM generates a lot of garbage around booleans which we need to chew
through. Though there's nothing AGX or really OpenCL specific here, so some of
this could help graphics shaders too.
Together, their effect is significant for that kernel instr count & occupancy:
before: 2966 inst, 2310 alu, 2310 fscib, 1216 ic, 23148 bytes, 239 regs, 384 threads
after: 2848 inst, 2246 alu, 2246 fscib, 1000 ic, 22260 bytes, 231 regs, 448 threads
No significant changes on GL shaderdb (a single godot shader regressed 1
instruction, 1344->1345).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31892>
For instance, this issue is triggered on redeonsi with
"piglit/bin/shader_runner tests/spec/glsl-1.50/linker/interface-blocks-multiple-vs-member-count-mismatch.shader_test -auto -fbo":
Indirect leak of 176 byte(s) in 1 object(s) allocated from:
#0 0x7f894b5cd7ef in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb17ef)
#1 0x7f894183aebf in ralloc_size ../src/util/ralloc.c:118
#2 0x7f894183b36e in rzalloc_size ../src/util/ralloc.c:152
#3 0x7f894183b36e in rzalloc_array_size ../src/util/ralloc.c:232
#4 0x7f894182da67 in _mesa_hash_table_init ../src/util/hash_table.c:163
#5 0x7f894182da67 in _mesa_hash_table_create ../src/util/hash_table.c:186
#6 0x7f894169af03 in gl_nir_validate_intrastage_interface_blocks ../src/compiler/glsl/gl_nir_link_interface_blocks.c:533
#7 0x7f89414464a4 in link_intrastage_shaders ../src/compiler/glsl/gl_nir_linker.c:2750
#8 0x7f894144bad2 in gl_nir_link_glsl ../src/compiler/glsl/gl_nir_linker.c:3785
#9 0x7f894128977e in st_link_glsl_to_nir ../src/mesa/state_tracker/st_glsl_to_nir.cpp:515
#10 0x7f894128977e in st_link_shader ../src/mesa/state_tracker/st_glsl_to_nir.cpp:1008
#11 0x7f894113c7b5 in link_program ../src/mesa/main/shaderapi.c:1317
#12 0x7f894113c7b5 in link_program_error ../src/mesa/main/shaderapi.c:1426
#13 0x7f8940afb1bb in _mesa_unmarshal_LinkProgram src/mapi/glapi/gen/marshal_generated2.c:1627
#14 0x7f894063319b in glthread_unmarshal_batch ../src/mesa/main/glthread.c:141
#15 0x7f894184e658 in util_queue_thread_func ../src/util/u_queue.c:294
#16 0x7f89418d220a in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
#17 0x7f894a66a7c3 (/lib64/libc.so.6+0x867c3)
...
SUMMARY: AddressSanitizer: 1392 byte(s) leaked in 11 allocation(s).
Fixes: ffbd763586 ("glsl: add gl_nir_validate_intrastage_interface_blocks()")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31871>
We shouldn't assume the bindings are sparse when we allocate an array
indexed on the binding. See, for example:
dEQP-GLES31.functional.program_interface_query.buffer_variable.random.55
Fixes: 2e833b16bc ("nir/lower_amul: Use num_ubos/ssbos instead of recomputing it.")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31611>
This fixes:
KHR-GLES32.core.tessellation_shader.tessellation_shader_tessellation.max_in_out_attributes
Without this change the assert in gather_output is hit:
assert(!nir_src_is_const(offset) || nir_src_as_uint(offset) == 0)
Because nir_opt_algebraic determines that some ssa values are constant,
but the nir_io_add_const_offset_to_base wasn't run afterwards.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31684>
When offset=0, the pass was a no-op but was setting the progress
flag which could cause infinite loops when this pass is going
to be added to gl_nir_opts.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31684>
Since we check for loop-invariance, we don't have to unconditionally
flag LCSSA phis as divergent in presence of divergent breaks.
This ensures consistency, with or without LCSSA form.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
This does shader analysis that is more niche than regular shader info.
It's planned to be used by nir_restructure_tcs_flow as discussed here:
https://gitlab.freedesktop.org/mesa/mesa/-/issues/11910
It's also useful for driver-specific passes.
The code for gathering "all_invocations_define_tess_levels" is copied
from radeonsi. The rest is new.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31447>
We need both the same-invocation usage mask and cross-invocation usage
mask. The AMD reason is below.
Cross-invocation TCS input access doesn't prevent the same-invocation
fast path in AMD hw because it's just a different way to load the same
data, and we want to use both paths for the same TCS input based on
the load instruction. The fast path can't be used for indirect access,
which is gathered separately for same-invocation access.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31645>
Switch NAK to it.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31706>
Helps for the case when max_distance is set to ~0, where the pass would now
only create groups of two loads together due to overflow. Found while
experimenting with this pass on r300, however the only driver currently
affected is i915.
With i915 this change gains around 20 shaders in my small shader-db
(most notably some GLMark2, Unigine Tropics, Tesseract, Amnesia) at
the expense of increased register pressure in few other cases.
I'm assuming this is a good deal for such old HW, and this seems like what
was intended when the pass was introduced to i915, but anyway this
could be tweaked further driver side with a more optimized max_distance
value. Only shader-db tested.
Relevant i915 shader-db stats (lpt):
total tex_indirect in shared programs: 1529 -> 1493 (-2.35%)
tex_indirect in affected programs: 96 -> 60 (-37.50%)
helped: 29
HURT: 2
total temps in shared programs: 3015 -> 3200 (6.14%)
temps in affected programs: 465 -> 650 (39.78%)
helped: 1
HURT: 91
GAINED: 20
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: GKraats <vd.kraats@hccnet.nl>
Fixes: 33b4eb149e
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31529>
lower_scan_reduce only worked when ballot_components equals one. This
commit adds support for arbitrary ballot_components.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31587>
This functionality will become more complex in the next commit so
separate it into a helper function.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31587>
build_subgroup_mask and build_ballot_imm_ishl will be needed by other
functions higher-up the file.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31587>
This intrinsic was initially dedicated to mesh/task shaders, but the
mechanism it exposes also exists in the compute shaders on Gfx12.5+.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31508>