Intel hardware does a form of multisample compression that involves an
auxilary surface called the MCS. When an MCS is in use, you have to first
sample from the MCS with a special opcode and then pass the result of that
operation into the next sample instrucion. Normally, we just do this
ourselves in the back-end, but we want to expose that functionality to NIR
so that we can use MCS values directly in NIR-based blorp.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This is similar to nir_channel except that it lets you grab more than one
channel by providing a mask.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
There's no reason for having a macro *and* a python generator. We can
easily just do the whole thing in python. This has the advantage that we
are no longer definining ALU# macros which conflict with the ones in
brw_fs_builder.h.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
These cases had the parameter removed:
nir/nir_lower_vec_to_movs.c: In function ‘try_coalesce’:
nir/nir_lower_vec_to_movs.c:124:66: warning: unused parameter ‘shader’ [-Wunused-parameter]
try_coalesce(nir_alu_instr *vec, unsigned start_idx, nir_shader *shader)
^
nir/nir_lower_io.c: In function ‘load_op’:
nir/nir_lower_io.c:147:32: warning: unused parameter ‘state’ [-Wunused-parameter]
load_op(struct lower_io_state *state,
^
These cases had the parameter (void) silenced because the parameter was
necessary for an interface:
nir/glsl_to_nir.cpp:1900:32: warning: unused parameter 'ir' [-Wunused-parameter]
nir_visitor::visit(ir_barrier *ir)
^
nir/nir.c: In function ‘remove_use_cb’:
nir/nir.c:802:35: warning: unused parameter ‘state’ [-Wunused-parameter]
remove_use_cb(nir_src *src, void *state)
^
nir/nir.c: In function ‘remove_def_cb’:
nir/nir.c:811:37: warning: unused parameter ‘state’ [-Wunused-parameter]
remove_def_cb(nir_dest *dest, void *state)
^
Number of total warnings in my build reduced from 2543 to 2538
(reduction of 5).
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
It's what all the call-sites once, so gets rid of a bunch of inlined
glsl_get_base_type() at the call-sites.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The i965 driver has its own pass for fusing mul+add combinations that's
much smarter than what nir_opt_algebraic can do so we don't want to get the
nir_opt_algebraic one just because we didn't set lower_ffma.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Prep work to reduce the noise in the next patch.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Going to convert this pass to parameterized lower_io_to_temporaries, and
we want the user to be able to specify whether to lower outputs or
inputs or both. The restriction of running this pass before validate
to avoid output reads no longer applies.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
A pass to lower complex (struct/array/mat) inputs/outputs to primitive
types. This allows, for example, linking that removes unused components
of a larger type which is not indirectly accessed.
In the near term, it is needed for gallium (mesa/st) support for NIR,
since only used components of a type are assigned VBO slots, and we
otherwise have no way to represent that to the driver backend. But it
should be useful for doing shader linking in NIR.
v2: use glsl_count_attribute_slots() rather than passing a type_size
fxn pointer
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Handled by tgsi_emulate for glsl->tgsi case.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This reverts commit 99474dc29b.
-Wpedantic is too verbose, even when applied to just a few includes.
We'll just have to deal with the issues as they come.
Reviewed-by: Brian Paul <brianp@vmware.com>
There are a couple of cycle count changes in shader-db, but it's
basically a wash.
However, with the Broadwell scalar TCS backend enabled, many
Shadow of Mordor shaders benefit from this patch. Because we don't
batch up output writes for TCS, vec4 outputs might not have all
components defined. Many output writes have a value of undef,
which is useless.
With scalar TCS, stats for tessellation shaders on Broadwell:
total instructions in shared programs: 1283000 -> 1280444 (-0.20%)
instructions in affected programs: 34302 -> 31746 (-7.45%)
helped: 71
HURT: 0
total cycles in shared programs: 10798768 -> 10780682 (-0.17%)
cycles in affected programs: 158004 -> 139918 (-11.45%)
helped: 71
HURT: 0
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
shader-db statistics on Broadwell:
total instructions in shared programs: 8963409 -> 8962455 (-0.01%)
instructions in affected programs: 60858 -> 59904 (-1.57%)
helped: 318
HURT: 0
total cycles in shared programs: 71408022 -> 71406276 (-0.00%)
cycles in affected programs: 398416 -> 396670 (-0.44%)
helped: 199
HURT: 51
GAINED: 1
The only shaders affected were in Dota 2 Reborn.
It also sets up for the next optimization.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
This better reflects what it does. I plan to add other ALU
optimizations as well, so the old name would be confusing.
In preparation for that, also move the file comments about csels
above the opt_undef_csel function, and delete the ones about there
not being other optimizations.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
The assert was null checking dest_arr_parent twice. The intention
seems to be to check both dest_ and src_.
Added in d3636da9
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Split 32-bit and 64-bit fmod lowering as the drivers might need to
lower them separately inside NIR depending on the HW support.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
There are rounding errors with the division in i965 that affect
the mod(x,y) result when x = N * y. Instead of returning '0' it
was returning 'y'.
This lowering pass fixes those cases.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Parenthesis are needed here as ! takes precedence over the &. The
check had the opposite effect than intended.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
There is no sense in having the double version of ldexp take a 64-bit
integer. Instead, let's just take a 32-bit int all the time. This also
matches what GLSL does where both variants of ldexp take a regular integer
for the exponent argument.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
The new expressions are more explicit in terms of where the bits go so it's
a little easier to tell what's going on. This is the way GLSL specifies
things so it's a bit easier to verify too. It also has the benifit that
the new expressions easily vectorize so we can constant-fold vector forms
of the _split versions correctly.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
This matches the "foreach x in container" pattern found in many other
programming languages. Generated by the following regular expression:
s/nir_foreach_def(\([^,]*\),\s*\([^,]*\))/nir_foreach_def(\2, \1)/
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This matches the "foreach x in container" pattern found in many other
programming languages. Generated by the following regular expression:
s/nir_foreach_use(\([^,]*\),\s*\([^,]*\))/nir_foreach_use(\2, \1)/
and similar expressions for nir_foreach_use_safe, etc.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This matches the "foreach x in container" pattern found in many other
programming languages. Generated by the following regular expression:
s/nir_foreach_function(\([^,]*\),\s*\([^,]*\))/nir_foreach_function(\2, \1)/
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This matches the "foreach x in container" pattern found in many other
programming languages.
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This matches the "foreach x in container" pattern found in many other
programming languages. Generated by the following regular expression:
s/nir_foreach_phi_src(\([^,]*\),\s*\([^,]*\))/nir_foreach_phi_src(\2, \1)/
and a similar expression for nir_foreach_phi_src_safe.
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
This matches the "foreach x in container" pattern found in many other
programming languages. Generated by the following regular expression:
s/nir_foreach_instr(\([^,]*\),\s*\([^,]*\))/nir_foreach_instr(\2, \1)/
and similar expressions for nir_foreach_instr_safe etc.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>