fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-24 14:58:10 +02:00

Author	SHA1	Message	Date
Eric Anholt	9a3a60cb13	nir: Don't try to to-SSA ALU instructions that are already SSA. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-29 11:43:33 -08:00
Eric Anholt	68d476167c	nir: Fix a bit of broken indentation. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-29 11:42:08 -08:00
Eric Anholt	36c604c824	nir: Add a couple of helpers for glsl types. This will be used by tgsi_to_nir, which needs to get vec4 types for declaring shader input/output variables. v2: Add a missing space. Reviewed-by: Matt Turner <mattst88@gmail.com> (v2) Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-29 11:41:17 -08:00
Eric Anholt	dd4d9a4e62	nir: Make vec-to-movs handle src/dest aliasing. It now emits vector MOVs instead of a series of individual MOVs, which should be useful to any vector backends. This pushes the problem of src/dest aliasing of channels on a scalar chip to the backend, but if there are any vector operations in your shader then you needed to be handling this already. Fixes fs-swap-problem with my scalarizing patches. v2: Rename to insert_mov(), and add a comment about what it does. v3: Rewrite the comment. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v3)	2015-01-28 16:33:34 -08:00
Jason Ekstrand	bb26ebac13	nir/opcodes: Use a return type of tfloat for ldexp Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-28 13:21:40 -08:00
Jason Ekstrand	f0340ff625	Revert "nir/opcodes: Use fpclassify() instead of isnormal() for ldexp" This reverts commit `d7d340fb2f`. We have an isnormal() implementation available, the only problem was that we had the wrong return type (fixed in a later patch). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88806 Acked-by: Matt Turner <mattst88@gmail.com>	2015-01-28 13:19:47 -08:00
Jason Ekstrand	d7d340fb2f	nir/opcodes: Use fpclassify() instead of isnormal() for ldexp Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88806 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-01-28 03:42:41 -08:00
Connor Abbott	f1a9252def	nir: fix a bug with constant folding non-per-component instructions Before, we were only copying the first N channels, where N is the size of the SSA destination, which is fine for per-component instructions, but non-per-component instructions like fdot3 can have more source components than destination components. Fix this using the helper function introduced in the last patch. v2: use new helper name Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-26 21:26:36 -05:00
Connor Abbott	816f0515a2	nir: add a helper function for getting the number of source components Unlike with non-SSA ALU instructions, where if they're per-component you have to look at the writemask to know which source channels are being used, SSA ALU instructions always have all the possible channels enabled so we can just look at the number of components in the SSA definition for per-component instructions to say how many source components are being used. v2: use new name nir_ssa_alu_instr_src_components() Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-26 21:26:36 -05:00
Jason Ekstrand	dd74369a0a	nir/opcodes: Don't go through doubles when constant-folding iabs Previously, we called the abs() function in math.h. However, this involves unnecessarily going through double. This commit changes it to use integers directly with a ternary. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-26 11:25:02 -08:00
Jason Ekstrand	9bd28fe3a3	nir/opcodes: Simplify and fix the unpack_half__split_ constant expressions Previously, these functions were explicitly writing to dst.x and dst.y. However they both return only one component so writing to dst.y is invalid. Also, since they only return one component, we don't need the explicit assignment in the expression and can simplify it to use an implicit assignment. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-26 11:25:02 -08:00
Jason Ekstrand	27c6e3e4ca	nir: Use pointers for nir_src_copy and nir_dest_copy This avoids the overhead of copying structures and better matches the newly added nir_alu_src_copy and nir_alu_dest_copy. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-26 11:24:58 -08:00
Connor Abbott	0aa31bf9c3	nir/constant_folding: use the new constant folding infrastructure Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-24 21:35:35 -08:00
Jason Ekstrand	89285e4d47	nir: add new constant folding infrastructure Add a required field to the Opcode class, const_expr, that contains an expression or statement that computes the result of the opcode given known constant inputs. Then take those const_expr's and expand them into a function that takes an opcode and an array of constant inputs and spits out the constant result. This means that when adding opcodes, there's one less place to update, and almost all the opcodes are self-documenting since the information on how to compute the result is right next to the definition. The helper functions in nir_constant_expressions.c were taken from ir_constant_expressions.cpp. v3 Jason Ekstrand <jason.ekstrand@iastate.edu> - Use mako to generate one function per opcode instead of doing piles of string splicing v4 Jason Ekstrand <jason.ekstrand@iastate.edu> - More comments and better indentation in the mako - Add a description of the constant expression language in nir_opcodes.py - Added nir_constant_expressions.py to EXTRA_DIST in Makefile.am Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-24 21:35:35 -08:00
Connor Abbott	fa4bc6c130	nir: use Python to autogenerate opcode information Before, we used a system where a file, nir_opcodes.h, defined some macros that were included to generate the enum values and the nir_op_infos structure. This worked pretty well, but for development the error messages were never very useful, Python tools couldn't understand the opcode list, and it was difficult to use nir_opcodes.h to do other things like autogenerate a builder API. Now, we store opcode information in nir_opcodes.py, and we have nir_opcodes_c.py to generate the old nir_opcodes.c and nir_opcodes_h.py to generate nir_opcodes.h, which contains all the enum names and gets included into nir.h like before. In addition to solving the above problems, using Python and Mako to generate everything means that it's much easier to keep information centralized as we add new things like constant propagation that require per-opcode information. v2: - make Opcode derive from object (Dylan) - don't use assert like it's a function (Dylan) - style fixes for fnoise, use xrange (Dylan) - use iterkeys() in nir_opcodes_h.py (Dylan) - use pydoc-style comments (Jason) - don't make fmin/fmax commutative and associative yet (Jason) Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> v3 Jason Ekstrand <jason.ekstrand@intel.com> - Alphabetize source file lists - Generate nir_opcodes.h in the builddir instead of the source dir - Include $(builddir)/src/glsl/nir in the i965 build - Rework nir_opcodes.h generation so it generates a complete header file instead of one that has to be embedded inside an enum declaration	2015-01-24 21:33:56 -08:00
Eric Anholt	0680d170d1	nir: Expose nir_print_instr() for debug prints It's nice to have this present in your default cases so you can see what instruction is triggering an abort. v2: Just pass a NULL state, now that it won't crash when you do. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-23 17:30:11 -08:00
Eric Anholt	6445a40520	nir: When asked to print with a NULL state, just use bare variable names. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-23 17:30:01 -08:00
Eric Anholt	447ddfc137	nir: Add nir_lower_alu_to_scalar. This is the equivalent of brw_fs_channel_expressions.cpp, which I wanted for vc4. v2: Use the nir_src_for_ssa() helper, and another instance of nir_alu_src_copy(). v3: Drop the non-SSA support. All intended callers will have SSA-only ALU ops. v4: Use insert_before, drop stale bcsel/fcsel comment, drop now-unused unsupported() function, drop lower_context struct. v5: Completely rename the pass to nir_lower_alu_to_scalar(), add an assert about weird input_sizes[]. Reviewed-by: Jason Ekstrand <jason.ekstrand@iastate.edu>	2015-01-23 16:37:23 -08:00
Eric Anholt	b200127816	nir: Make some helpers for copying ALU src/dests. There aren't many users yet, but I wanted to do this from my scalarizing pass. v2: Constify the src arguments. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-23 16:37:16 -08:00
Kenneth Graunke	15063d2ad0	nir: Add algebraic optimizations for division and reciprocal. These also exist in opt_algebraic.cpp. total NIR instructions in shared programs: 2011430 -> 2011211 (-0.01%) NIR instructions in affected programs: 42221 -> 42002 (-0.52%) helped: 198 total i965 instructions in shared programs: 6020553 -> 6020116 (-0.01%) i965 instructions in affected programs: 84322 -> 83885 (-0.52%) helped: 394 HURT: 1 (by 1 instruction) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-23 14:53:26 -08:00
Kenneth Graunke	bbd60f6d79	nir: Add algebraic optimizations for exponential/logarithmic functions. Most of these exist in the GLSL IR algebraic pass already. However, SSA allows us to find more instances of the patterns. total NIR instructions in shared programs: 2015593 -> 2011430 (-0.21%) NIR instructions in affected programs: 124189 -> 120026 (-3.35%) helped: 604 total i965 instructions in shared programs: 6025505 -> 6018717 (-0.11%) i965 instructions in affected programs: 261295 -> 254507 (-2.60%) helped: 1295 HURT: 3 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-23 14:53:26 -08:00
Kenneth Graunke	391fb32bbe	nir: Add algebraic optimizations for simplifying comparisons. The first batch removes bonus fnot/inot operations, possibly allowing other optimizations to better recognize patterns. The next batch replaces a fadd and constant 0.0 with an fneg - negation is usually free on GPUs, while addition is not. total NIR instructions in shared programs: 2020814 -> 2015593 (-0.26%) NIR instructions in affected programs: 411143 -> 405922 (-1.27%) helped: 2233 HURT: 214 A few shaders are hurt by a few instructions due to moving neg such that it has a constant operand, which is then folded, resulting in two distinct load_consts for x and -x. We can always clean that up later. total i965 instructions in shared programs: 6035392 -> 6025505 (-0.16%) i965 instructions in affected programs: 784980 -> 775093 (-1.26%) helped: 4508 HURT: 2 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-23 14:53:26 -08:00
Kenneth Graunke	551a752a59	nir: Add algebraic optimizations for pointless shifts. The GLSL IR optimization pass contained these; we may as well include them too. v2: Fix a >> 0 and a << 0 optimizations (caught by Matt). No change in the number of NIR instructions on a shader-db run. total i965 instructions in shared programs: 6035397 -> 6035392 (-0.00%) i965 instructions in affected programs: 542 -> 537 (-0.92%) helped: 2 (in glamor) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-23 14:53:26 -08:00
Kenneth Graunke	3e56572c49	nir: Add a bunch of algebraic optimizations on logic/bit operations. Matt and I noticed a bunch of "val <- ior a a" operations in a shader, so we decided to add an algebraic optimization for that. While there, I decided to add a bunch more of them. v2: Delete bogus fand/for optimizations (caught by Jason). total NIR instructions in shared programs: 2023511 -> 2020814 (-0.13%) NIR instructions in affected programs: 149634 -> 146937 (-1.80%) helped: 1032 total i965 instructions in shared programs: 6035392 -> 6035397 (0.00%) i965 instructions in affected programs: 537 -> 542 (0.93%) HURT: 2 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-23 14:53:26 -08:00
Kenneth Graunke	978b0a9cda	nir: Implement CSE on intrinsics that can be eliminated and reordered. Matt and I noticed that one of the shaders hurt by INTEL_USE_NIR=1 had load_input and load_uniform intrinsics repeated several times, with the same parameters, but each one generating a distinct SSA value. This made ALU operations on those values appear distinct as well. Generating distinct SSA values is silly - these are read only variables. CSE'ing them makes everything use a single SSA value, which then allows other operations to be CSE'd away as well. Generalizing a bit, it seems like we should be able to safely CSE any intrinsics that can be eliminated and reordered. I didn't implement support for variables for the time being. v2: Assert that info->num_variables == 0 (requested by Jason). total NIR instructions in shared programs: 2435936 -> 2023511 (-16.93%) NIR instructions in affected programs: 2413496 -> 2001071 (-17.09%) helped: 16872 total i965 instructions in shared programs: 6028987 -> 6008427 (-0.34%) i965 instructions in affected programs: 640654 -> 620094 (-3.21%) helped: 2071 HURT: 585 GAINED: 14 LOST: 25 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-23 14:53:26 -08:00
Kenneth Graunke	cbdd623f13	nir: Pull nir_instr_can_cse()'s SSA checks out of the switch. This should not be a change in behavior, as all current cases that potentially answer "yes" require SSA. The next patch will introduce another case that requires SSA. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-23 14:53:26 -08:00
Connor Abbott	68a9d0b36f	nir: add generated file to .gitignore Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-23 10:20:46 -08:00
Eric Anholt	fc6938d23e	nir: Fix setup of constant bool initializers. brw_fs_nir has only seen scalar bools so far, thanks to vector splitting, and the ralloc of in glsl_to_nir.cpp will usually get you a 0-filled chunk of memory, so reading too large of a value will usually get you the right bool value. But once we start doing vector bools in a few commits, we end up getting bad values. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-22 13:52:19 -08:00
Eric Anholt	534a4ec82f	nir: Make an easier helper for setting up SSA defs. Almost all instructions we nir_ssa_def_init() for are nir_dests, and you have to keep from forgetting to set is_ssa when you do. Just provide the simpler helper, instead. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-22 13:52:19 -08:00
Matt Turner	28b7c6b285	nir: Replace assert(0) with unreachable(). Fixes a couple of warnings in the process. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-21 21:06:37 -08:00
Jason Ekstrand	f88c6a4997	nir: Stop using designated initializers Designated initializers with anonymous unions don't work in MSVC or GCC < 4.6. With a couple of constructor methods, we don't need them any more and the code is actually cleaner. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88467 Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-21 19:55:02 -08:00
Jason Ekstrand	7da60eca4f	nir: Add src and dest constructors Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-21 12:21:10 -08:00
Jason Ekstrand	194f6235b3	nir: Add a nir_foreach_phi_src helper macro Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-20 16:53:29 -08:00
Vinson Lee	10a4f1e77a	nir: s/malloc.h/stdlib.h/ Fix build error on Mac OS X. CC nir_to_ssa.lo nir_to_ssa.c:29:10: fatal error: 'malloc.h' file not found ^ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88478 Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2015-01-16 16:14:51 -08:00
Jason Ekstrand	bc6e57e019	nir/live_variables: Use a worklist This is a rework of the liveness algorithm using a worklist as suggested by Connor. Doing so reduces the number of times we walk over the instructions because we don't have to do an entire pointless walk over the instructions just to figure out it's time to stop. Also, the stuff after the last loop in the function will only ever get visited once. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 16:54:21 -08:00
Jason Ekstrand	4839d1aed1	nir: Add a worklist helper structure A worklist is a common concept in optimizations. This adds a structure that we can reuse for many different types of optimizations. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 16:54:21 -08:00
Brian Paul	0aaaa13ec9	nir: fix incorrect argument passed to validate_src() in validate_tex_instr() Silences a compiler warning. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 17:41:42 -07:00
Brian Paul	aa479a69d6	nir: silence compiler warning from visit_src() call v2: use proper argument Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 17:09:02 -07:00
Jason Ekstrand	153b8b3525	util/hash_set: Rework the API to know about hashing Previously, the set API required the user to do all of the hashing of keys as it passed them in. Since the hashing function is intrinsically tied to the comparison function, it makes sense for the hash set to know about it. Also, it makes for a somewhat clumsy API as the user is constantly calling hashing functions many of which have long names. This is especially bad when the standard call looks something like _mesa_set_add(ht, _mesa_pointer_hash(key), key); In the above case, there is no reason why the hash set shouldn't do the hashing for you. We leave the option for you to do your own hashing if it's more efficient, but it's no longer needed. Also, if you do do your own hashing, the hash set will assert that your hash matches what it expects out of the hashing function. This should make it harder to mess up your hashing. This is analogous to `94303a0750` where we did this for hash_table Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-15 13:21:27 -08:00
Jason Ekstrand	4c99e3ae78	util: Move main/set to util/hash_set Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-15 13:21:27 -08:00
Jason Ekstrand	8ed5305d28	hash_table: Rename insert_with_hash to insert_pre_hashed We already have search_pre_hashed. This makes the APIs match better. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-15 13:21:27 -08:00
Jason Ekstrand	0d05d1226e	nir/algebraic: Only replace an instruction once Without the break, it was possible that an instruction would match multiple expressions. If this happened, you could end up trying to replace it multiple times and get a segfault. This makes it so that, after a successful replacement, it moves on to the next instruction. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:24 -08:00
Jason Ekstrand	0f85310975	nir/vars_to_ssa: Use the copy lowering from lower_var_copies Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:24 -08:00
Jason Ekstrand	d3636da902	nir: Add a pass for lowering copy instructions Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:24 -08:00
Jason Ekstrand	700ba5daaf	nir/vars_to_ssa: Refactor get_deref_node This refactor allows you to more easily get the deref node associated with a given variable. We then use that new functionality in the deref_may_be_aliased function instead of creating a 1-element deref chain. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:24 -08:00
Jason Ekstrand	55b5058e69	nir: Rename lower_variables to lower_vars_to_ssa The original name wasn't particularly descriptive. This one indicates that it actually gives you SSA values as opposed to the old pass which lowered variables to registers. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:24 -08:00
Jason Ekstrand	4aa6162f6e	nir/tex_instr: Add a nir_tex_src struct and dynamically allocate the src array This solves a number of problems. First is the ability to change the number of sources that a texture instruction has. Second, it solves the dilemma that may occur if a texture instruction has more than 4 sources. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:24 -08:00
Jason Ekstrand	dcb1acdea0	nir/validate: Only build in debug mode Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:24 -08:00
Jason Ekstrand	347ab2bf24	nir/lower_variables: Improve documentation Additional description was added to a variety of places. Also, we no longer use the term "leaf" to describe fully-qualified direct derefs. Instead, we simply use the term "direct" or spell it out completely. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:23 -08:00
Jason Ekstrand	8016fa39e1	nir/lower_variables: Use a for loop for get_deref_node Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:23 -08:00


169 commits