Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: 09705747d7 ("nir/algebraic: Reassociate fadd into fmul in DPH-like pattern")
This introduces two new lowering passes. One to lower VS to explicit
outputs using STLW and one to lower GS to load input using LDLW and
implement the GS specific functionality.
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
These intrinsics will let us do all the offset calculations in nir,
which is nicer to work with and lets nir_opt_algebraic eat it all up.
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
This allows us to make sure clipdist is emitted as a scalar array rather
than two vec4s. This matches SPIR-V semantics, and will be useful for
Zink.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
The check to see if we were dealing with a buffer block was
too late and only worked for named UBOs.
Fixes: f32b01ca43 "glsl/linker: remove ubo explicit binding handling"
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1900
Adds alternate versions of the atan builtin functions that use
ir_unop_atan and ir_binop_atan2 instead of inlining to the IR
implementation of the function. These alternatives are selected if the
IR is going to be consumed by NIR. In that case the IR ops will be
translated to the appropriate NIR op.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Adds ir_binop_atan2 and ir_unop_atan. When converting to NIR these are
expanded out using the appropriate builtin generator. If they are used
with anything else then it will just hit an assert.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Moves build_atan and build_atan2 into nir_builtin_builder. The goal is
to be able to use this from the GLSL translator too.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
And after discard-only loops. Otherwise we end up with dead code
which confuses nir_repair_ssa into adding a whole bunch of uses
of undefined. However, for derefs, we sometimes always expect to
get a variable instead of undefined.
Fixes dEQP-VK.graphicsfuzz.write-red-in-loop-nest on radv.
Fixes: c832820ce9 "nir/dead_cf: Repair SSA if the pass makes progress"
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1928
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
v2: always assert on the texture/sampler handle's num_components
v3: replicate the deref inside the loop
v4: remove a case of useless line wrapping
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Currently meson doesn't correctly handle passing compiled binaries to
scripts in tests. This patch looks to the future (0.53) when meson will
have this functionality, but also immediately it fixes these tests in
cross compiles by causing them to return 77, which meson interprets as
skip.
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
There are quite a few tests that require getopt, when using MSVC we need
to use the bundled version of getopt since there isn't a system version.
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
so that drivers don't have to call nir_strip manually.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Previously, this could have made the resource divergent in code like
that which is genereated by nir_lower_non_uniform_access.
Fixes: da8ed68a ('nir: replace nir_move_load_const() with nir_opt_sink()')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Previously, for code like:
loop {
loop {
a = load_ubo()
}
use(a)
}
adjust_block_for_loops() would return the block before the first loop.
Now we compute the range of allowed blocks and then walk the dominance
tree directly, guaranteeing directly that we always choose a block that
dominates all the uses and is dominated by the definition.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
These can appear after loop unrolling.
v2: stylistic changes
v2: replace state->mem_ctx with state->shader
v2: add bounds checking
v3: use nir_intrinsic_range() for bounds checking
v3: fix issue where partially out-of-bounds reads are replaced with undefs
v4: fix merge conflicts during rebase
v5: split into two commits
v6: set constant_data to NULL after freeing (fixes nir_sweep()/Iris)
v7: don't remove the constant data if there are no constant loads
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v6)
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
We only have the subgroup variant in NIR (equivalent to clockARB), so
only support that for now.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Working on the algebraic implementation, I was being driven nuts by my
editor not highlighting and handling indentation for the C code. It turns
out that it's basically not pass-specific code, and we can move it over to
the relevant .c file. Replaces 30KB of code with 34KB of data on my i965
build. No perf diff on shader-db (n=3)
Reviewed-by: Ian Romanick <ian.d.romainck@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
This lets us memoize range analysis work across instructions. Reduces
runtime of shader-db on Intel by -30.0288% +/- 2.1693% (n=3).
Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Having passes generate these is just making more work for copy
propagation (and thus probably calling more optimization passes)
later. Noticed while trying to debug nir_opt_algebraic()
top-to-bottom having O(n^2) behavior due to not finding new matches in
replacement code.
Reviewed-by: Ian Romanick <ian.d.romainck@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
This matches what we do for uses_sample_qualifier, and what we
do in ir_set_program_inouts.cpp as well.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
From EXT_demote_to_helper_invocation, implemented with the existing
nir_intrinsic_is_helper_invocation.
Such builtin is necessary when using `demote` because we can't
redefine the value of gl_HelperInvocation (since it is an input
variable).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
When the EXT_demote_to_helper_invocation extension is enabled,
`demote` is treated as a keyword, and produces an ir_demote.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>