fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-23 19:28:11 +02:00

Author	SHA1	Message	Date
Jose Fonseca	40a4797384	nir: Use helper macros for dealing with VLAs. v2: - Single statement, by using memset return value as suggested by Ian Romanick. - No internal declaration, as suggested by Jason Ekstrand. - Move macros to a header. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-03-04 10:52:02 +00:00
Jose Fonseca	f320ecf218	nir: Use alloca instead of variable length arrays. This is to enable the code to build with -Werror=vla in the short term, and enable the code to build with MSVC2013 soon after. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-27 14:30:36 +00:00
Kenneth Graunke	8e62bd52f8	nir: Introduce nir_intrinsic_discard_if. This is a conditional discard, which takes a boolean source. Note that we don't generate ir_discard::condition today, so this shouldn't break drivers (since none implement this intrinsic yet). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-02-24 15:24:52 -08:00
Jason Ekstrand	c750ecaa12	nir/register: Add a parent_instr field This adds a parent_instr field similar to the one for ssa_def. The difference here is that the parent_instr field on a nir_register can be NULL if the register does not have a unique definition or if that definition does not dominate all its uses. We set this field in the out-of-SSA pass so that backends can get SSA-like information even after they have gone out of SSA. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-02-24 14:08:04 -08:00
Jason Ekstrand	9b9ef2aeee	nir/gcm: Add some missing break statements Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-23 13:20:13 -08:00
Jason Ekstrand	cb4b2ad44a	nir: Copy-propagate vecN operations that are actually moves We were already doing this for ALU operations but not for non-ALU operations. This changes that. total NIR instructions in shared programs: 2039883 -> 2022338 (-0.86%) NIR instructions in affected programs: 1768850 -> 1751305 (-0.99%) helped: 14244 HURT: 124 total FS instructions in shared programs: 4083960 -> 4084036 (0.00%) FS instructions in affected programs: 7302 -> 7378 (1.04%) helped: 12 HURT: 51 Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-23 13:19:05 -08:00
Eric Anholt	4359954d84	nir: Generalize the optimization of subs of subs from 0. I initially wrote this based on the "(('fneg', ('fneg', a)), a)" above, but we can generalize it and make it more potentially useful. In the specific original case of a 0 for our new 'a' argument, it'll get further algebraic optimization once the 0 is an argument to the new add. No shader-db effects. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-21 14:57:14 -08:00
Eric Anholt	345c2b288a	nir: Collapse repeated bcsels on the same argument. vc4 results: total instructions in shared programs: 39881 -> 39794 (-0.22%) instructions in affected programs: 6302 -> 6215 (-1.38%) Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-21 14:57:14 -08:00
Eric Anholt	a38038ca5e	nir: When faced with a csel on !condition, just flip the arguments. total NIR instructions in shared programs: 39426 -> 39411 (-0.04%) NIR instructions in affected programs: 3748 -> 3733 (-0.40%) Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-21 14:57:14 -08:00
Eric Anholt	8e1152cb33	nir: Allow nir_opt_algebraic to see booleanness through &&, ||, ^, !. We have some useful optimizations to drop things like 'ine a, 0' on a boolean argument, but if 'a' came from logical operations on bools, it couldn't tell. These kinds of constructs appear as a result of TGSI->NIR quite frequently (at least with if flattening), so being a little more aggressive in detecting booleans can pay off. v2: Add ixor as a booleanness-preserving op (Suggestion by Connor). vc4 results: total instructions in shared programs: 40207 -> 39881 (-0.81%) instructions in affected programs: 6677 -> 6351 (-4.88%) Reviewed-by: Matt Turner <mattst88@gmail.com> (v1) Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-21 14:57:14 -08:00
Eric Anholt	dc982f4a85	nir: Add a couple of simplifications of csel operations. vc4 was already cleaning these up, but it does shave 4 NIR instructions in shader-db. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-21 14:57:14 -08:00
Kenneth Graunke	b6393d7040	nir: Fix the Mesa build without -DDEBUG. With -DDEBUG -UNDEBUG, this assert uses reg_state::stack_size, which doesn't exist, breaking the build: assert(state->states[index].index < state->states[index].stack_size); Switch it to ifndef NDEBUG, so the field will exist if the assertion actually generates code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-20 13:43:44 -08:00
Eric Anholt	bef38f62e0	nir: Drop dependency on mtypes.h for core NIR. One less new directory necessary for gallium code that wants to interact with NIR. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-02-20 11:36:34 -08:00
Eric Anholt	b53d035825	util: Move Mesa's bitset.h to util/. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-02-20 11:36:34 -08:00
Jason Ekstrand	c7002fad90	nir/GCM: Pull unpinned instructions out of blocks while pinning This lets us be slightly more efficient by not walking the CFG extra times. Also, it may make it easier to ensure that GVN happens on only unpinned instructions. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-19 17:06:17 -08:00
Jason Ekstrand	8dfe6f672f	nir/GCM: Use pass_flags instead of bitsets for tracking visited/pinned Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-19 17:06:17 -08:00
Jason Ekstrand	190073c737	nir: Add a global code motion (GCM) pass v2 Jason Ekstrand <jason.ekstrand@intel.com>: - Use nir_dominance_lca for computing least common ancestors - Use the block index for comparing dominance tree depths - Pin things that do partial derivatives Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-19 17:06:17 -08:00
Jason Ekstrand	a52a4b5223	nir/instr: Change "live" to a more generic "pass_flags" field Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-19 17:06:17 -08:00
Jason Ekstrand	3d25afc51c	nir: Make nir_[cf_node/instr]_[prev/next] return null if at the end Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-19 17:06:17 -08:00
Jason Ekstrand	902b0ccc9a	nir/from_ssa: Don't try to read an invalid instruction Right now, the nir_instr_prev function blindly looks up the previous element in the exec list and casts it to an instruction even if it's the tail sentinel. The next commit will change this to return null if it's the first instruction. Making this change first avoids getting a segfault between commits. The only reason we never noticed is that, thanks to the way things are laid out in nir_block, the casted instruction's type was never parallel_copy. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-19 17:06:17 -08:00
Jason Ekstrand	0281fd0786	nir/validate: Validate SSA defs the same way we do for registers Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-19 17:06:17 -08:00
Jason Ekstrand	34952b5671	nir/validate: Validate if_uses on registers Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-19 17:06:17 -08:00
Jason Ekstrand	98ecb25f89	nir: Properly clean up CF nodes when we remove them Previously, if you removed a CF node that still had instructions in it, none of the use/def information from those instructions would get cleaned up. Also, we weren't removing if statements from the if_uses of the corresponding register or SSA def. This commit fixes both of these problems. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-19 17:06:17 -08:00
Jason Ekstrand	e025943134	nir: use nir_foreach_ssa_def for indexing ssa defs This is both simpler and more correct. The old code didn't properly index load_const instructions. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-19 17:06:17 -08:00
Jason Ekstrand	0167c38cac	nir/from_ssa: Use the nir_block_dominance function instead of our own Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-19 17:06:17 -08:00
Jason Ekstrand	f481a9425c	nir/dominance: Add a constant-time mechanism for comparing blocks This is mostly thanks to Connor. The idea is to do a depth-first search that computes pre and post indices for all the blocks. We can then figure out if one block dominates another in constant time by two simple comparison operations. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-19 17:06:17 -08:00
Jason Ekstrand	b4c5489c8a	nir/dominance: Expose the dominance intersection function Being able to find the least common ancestor in the dominance tree is a useful thing that we may want to do in other passes. In particular, we need it for GCM. v2: Handle NULL inputs by returning the other block Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-19 17:06:16 -08:00
Brian Paul	2f5597787c	nir: add missing GLSL_TYPE_DOUBLE case in type_size() To silence compiler warning about unhandled switch case. v2: move GLSL_TYPE_DOUBLE to the "not reached" section, per Ilia. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 15:36:59 -07:00
Eric Anholt	2a135c470e	nir: Add an ALU op builder kind of like ir_builder.h v2: Rebase on the nir_opcodes.h python code generation support. v3: Use SSA values, and set an appropriate writemask on dot products. v4: Make the arguments be SSA references as well. This lets you stack up expressions in the arguments of other expressions, at the cost of having to insert a fmov/imov if you want to swizzle. Also, add the generated file to NIR_GENERATED_FILES. v5: Use more pythonish style for iterating the list. v6: Infer the size of the dest from the size of the srcs, and auto-swizzle a single small src out to the appropriate size. v7: Add little helpers for initializing the struct, add a typedef for the struct like other nir types have. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v6) Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v7)	2015-02-18 22:28:42 -08:00
Eric Anholt	6eadde51bb	nir: Recognize and reduce duplicated fsats. No effect on vc4 shader-db. v2: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2015-02-18 14:47:51 -08:00
Eric Anholt	1907a3a7ee	nir: Add a flag for lowering fsat. vc4 cse/algebraic-disabled stats: total instructions in shared programs: 44356 -> 44354 (-0.00%) instructions in affected programs: 55 -> 53 (-3.64%) v2: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2015-02-18 14:47:51 -08:00
Eric Anholt	e5ecf8e427	nir: Add a flag for lowering ffma. vc4 cse/algebraic-disabled stats: total uniforms in shared programs: 13966 -> 13791 (-1.25%) uniforms in affected programs: 435 -> 260 (-40.23%) total instructions in shared programs: 44732 -> 44356 (-0.84%) instructions in affected programs: 9599 -> 9223 (-3.92%) v2: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2015-02-18 14:47:51 -08:00
Eric Anholt	42a8ace66e	nir: Add a flag for lowering fneg/ineg. vc4 cse/algebraic-disabled stats: total instructions in shared programs: 44911 -> 44732 (-0.40%) instructions in affected programs: 11371 -> 11192 (-1.57%) v2: Fix broken iabs(isub(0, a)) transformation. v3: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2015-02-18 14:47:51 -08:00
Eric Anholt	cb95a228e8	nir: Add a flag for lowering fsqrt(x) to frcp(frsqrt(x)). vc4 cse/algebraic-disabled stats: total uniforms in shared programs: 13972 -> 13966 (-0.04%) uniforms in affected programs: 408 -> 402 (-1.47%) total instructions in shared programs: 44973 -> 44911 (-0.14%) instructions in affected programs: 1551 -> 1489 (-4.00%) v2: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2015-02-18 14:47:50 -08:00
Eric Anholt	ccf14bca4b	nir: Add lowering of POW instructions if the lower flag is set. This could be done in a separate pass like we do in GLSL IR, but it seems to me like having the definitions of the transformations in the two directions next to each other makes a lot of sense. v2: Reorder the comment about the transformation. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-18 14:47:50 -08:00
Eric Anholt	8e9dbfff17	nir: Conditionalize the POW reconstruction on shader compiler options. Mesa has a shader compiler struct flagging whether GLSL IR's opt_algebraic and other passes should try and generate certain types of opcodes or patterns. Extend that to NIR by defining our own struct, which is automatically generated from the Mesa struct in glsl_to_nir and provided directly by the driver in TGSI-to-NIR. v2: Split out the previous two prep patches. v3: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v2)	2015-02-18 14:47:50 -08:00
Eric Anholt	955a6bb57d	nir: Add an optional expression controlling nir_algebraic xforms. This will be used so that we can customize the transforms for the target GPU, so we don't un-lower expressions that had already been lowered (or introduce new lowering transformations that not all GPUs want) v2: Drop the complication of having the condition->index dictionary, since we don't actually expect there to be many different conditions (change by Kenneth). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-18 14:47:50 -08:00
Eric Anholt	f90bb54734	nir: Add a nir_shader_compiler_options struct pointed to by the shaders. This will be used to give the optimization passes a chance to customize behavior for the particular target device. v2: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2015-02-18 14:47:50 -08:00
Jason Ekstrand	dd110cdfd8	nir: Make gl_FrontFacing a system_value GLSL IR labels gl_FrontFacing as an input variable and not a system value. This commit makes NIR silently translate gl_FrontFacing to a system value so that it properly gets translated into a load_system_value intrinsic. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-14 13:47:16 -08:00
Jason Ekstrand	929f43851e	nir/lower_phis_to_scalar: Fix some logic in is_phi_scalarizable Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-14 13:46:59 -08:00
Matt Turner	4c42e1116b	nir: Recognize open-coded fmin/fmax. And unfortunately other shaders do the same thing but with >=/<= which we can't apply this optimization to because of NaNs. instructions in affected programs: 23309 -> 22938 (-1.59%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-11 13:50:19 -08:00
Eric Anholt	56e21647e2	nir: Add algebraic opt for int comparisons with identical operands. No change on shader-db on i965. v2: Reword the comment due to feedback from Erik Faye-Lund Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v1) Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> (v1)	2015-02-11 11:52:38 -08:00
Eric Anholt	2919bdf466	nir: Fix load_const comparisons for CSE. We want the size of a float per component, not the size of a whole vec4. NIR instructions on i965: total instructions in shared programs: 1261937 -> 1261929 (-0.00%) instructions in affected programs: 114 -> 106 (-7.02%) Looking at one of these examples (tesseract), it's from vec4 load_consts for a MRT solid fill, which do get CSEed now that we don't memcmp off the end of the const value and into the SSA def. For the 1-component loads that are common in i965, we were only memcmping off into the rest of the usually zero-filled const_value. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-11 11:52:38 -08:00
Matt Turner	a9065cef48	nir: Remove casts from void*. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-10 17:48:42 -08:00
Matt Turner	bb1e007157	nir: Replace assert(0) with unreachable(). Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-10 17:48:31 -08:00
Matt Turner	942b56ad05	nir: Remove unused has_indirect variable. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-02-10 17:48:16 -08:00
Kenneth Graunke	480ee1f0b4	nir: Mark nir_print_instr's instr pointer as const. Printing instructions doesn't modify them, so we can mark the parameter const. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-02-10 03:37:55 -08:00
Eric Anholt	bff4cbdafa	nir: Fix broken fsat recognizer. We've probably never seen this ridiculous pattern in the wild, so it didn't matter. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-02-06 15:57:55 -08:00
Eric Anholt	6706537dd4	nir: Slightly simplify algebraic code generation by reusing a struct. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-02-06 15:57:55 -08:00
Connor Abbott	a135f34080	nir: add an optimization to remove useless phi nodes This removes phi nodes whose sources all point to the same thing. Shader-db results: total NIR instructions in shared programs: 2045293 -> 2041209 (-0.20%) NIR instructions in affected programs: 126564 -> 122480 (-3.23%) helped: 615 HURT: 0 total FS instructions in shared programs: 4321840 -> 4320392 (-0.03%) FS instructions in affected programs: 24622 -> 23174 (-5.88%) helped: 138 HURT: 0 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Tested-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-03 16:00:13 -05:00
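
The constant-time dominance check described in commit f481a9425c (a depth-first search that assigns pre and post indices, then two comparisons) can be sketched roughly as below. This is a Python illustration of the general technique, not the C code from that commit: the `children` dictionary representing the dominator tree and the function names `number_dom_tree` and `dominates` are hypothetical, chosen only for this sketch.

```python
def number_dom_tree(children, root):
    """Assign DFS pre/post indices to every node of a dominator tree.

    `children` maps a block name to the list of blocks it immediately
    dominates (a hypothetical representation used only in this sketch).
    """
    pre, post = {}, {}
    counter = 0
    pre[root] = counter
    counter += 1
    # Iterative DFS: each stack entry is (node, iterator over its children).
    stack = [(root, iter(children.get(root, ())))]
    while stack:
        node, it = stack[-1]
        child = next(it, None)
        if child is None:
            # All children visited: record the post index and unwind.
            post[node] = counter
            counter += 1
            stack.pop()
        else:
            pre[child] = counter
            counter += 1
            stack.append((child, iter(children.get(child, ()))))
    return pre, post


def dominates(pre, post, a, b):
    """Return True if block `a` dominates block `b`.

    `a` dominates `b` exactly when b's DFS interval is nested inside a's,
    which takes two comparisons. Every block dominates itself.
    """
    return pre[a] <= pre[b] and post[b] <= post[a]
```

After a single numbering pass, each query costs two integer comparisons regardless of tree depth. The indices would need recomputing whenever the dominator tree changes.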


231 commits