fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-29 22:48:12 +02:00

Author	SHA1	Message	Date
Jason Ekstrand	fe399f3a69	nir/info: Move the GS info into a stage-specific info union This way we can have other stage-specific info without consuming too much extra space. While we're at it, we make sure that the geometry info is only set if we're actually a geometry shader. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-10-19 08:47:03 -07:00
Jason Ekstrand	16619477bc	mesa: Move gl_frag_depth_layout from mtypes.h to shader_enums.h Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-10-19 08:47:03 -07:00
Jason Ekstrand	5d4bc5ec13	nir: Add a label to nir_shader_info Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-10-19 08:45:14 -07:00
Rob Clark	b9b40ef9b7	nir: remove dependency on glsl Move glsl_types into NIR, now that the dependency on glsl_symbol_table has been split out. Possibly makes sense to rename things at this point, but if we do that I'd like to keep it split out into a separate patch to make git history easier to follow (IMHO). v2: fix android build v3: I f***ing hate scons.. but at least it builds Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-10-16 19:33:38 -04:00
Rob Clark	183db3a645	glsl: move half<->float conversion to util Needed in NIR too, so move out of mesa/main/imports.c Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-10-16 19:33:37 -04:00
Rob Clark	33de998230	glsl: couple shader_enums cleanups Add missing enum to gl_system_value_name() and move VARYING_SLOT_MAX / FRAG_RESULT_MAX / etc into shader_enums.h as suggested by Emil. v2: add STATIC_ASSERT()'s Reported-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-10-16 19:33:37 -04:00
Timothy Arceri	3c87377d0b	nir: add atomic lowering support for AoA Cc: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-17 08:43:21 +11:00
Timothy Arceri	2e1798f183	nir: wrapper for glsl_type arrays_of_arrays_size() Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-17 08:43:15 +11:00
Iago Toral Quiroga	c8f5274b52	nir: Get the number of SSBOs and UBOs right Before `d31f98a272` and `56e2bdbca3` we had a single index space for UBOs and SSBOs, so NumBufferInterfaceBlocks would contain the combined number of blocks, not just one kind. This means that for shader programs using both UBOs and SSBOs, we were setting num_ssbos and num_ubos to a larger number than we should. Since the above commits we have separate index spaces for each so we can just get the right numbers. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-16 10:12:44 +02:00
Jason Ekstrand	b705005584	nir/glsl: Use shader_prog->Name for naming the NIR shader This has the better name to use. Apparently, sh->Name is usually 0. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-10-15 07:31:09 -07:00
Jason Ekstrand	eb893c220c	nir: Add helpers for creating variables and adding them to lists Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-15 07:31:09 -07:00
Iago Toral Quiroga	27dccf097d	mesa: Rename {Num}UniformBlocks to {Num}BufferInterfaceBlocks Currently, these arrays in gl_shader and gl_shader_program hold both UBOs and SSBOs, so this looks like a better name. We were already using NumBufferInterfaceBlocks in gl_shader_program, so this makes things more consistent as well. In a later patch we will add {Num}UniformBlocks and {Num}ShaderStorageBlocks which will contain only references to UBOs and SSBOs respectively that will provide backends with a separate index space for both types of objects. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-14 08:11:13 +02:00
Iago Toral Quiroga	baee16bf02	nir: split SSBO min/max atomic intrinsics into signed/unsigned versions NIR is typeless so this is the only way to keep track of the type to select the proper atomic to use. v2: - Use imin,imax,umin,umax for the intrinsic names (Connor Abbott) - Change message for unreachable paths (Michael Schellenberger) Tested-by: Markus Wick <markus@selfnet.de> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-14 08:03:58 +02:00
Rob Clark	c9b982b72d	glsl: move shader_enums into nir First step towards inverting the dependency between glsl and nir (so nir can be used without glsl). Also solves this issue with 'make distclean' Making distclean in mesa make[2]: Entering directory '/mnt/sdb1/Src64/Mesa-git/mesa/src/mesa' Makefile:2486: ../glsl/.deps/shader_enums.Plo: No such file or directory make[2]: *** No rule to make target '../glsl/.deps/shader_enums.Plo'. Stop. make[2]: Leaving directory '/mnt/sdb1/Src64/Mesa-git/mesa/src/mesa' Makefile:684: recipe for target 'distclean-recursive' failed make[1]: *** [distclean-recursive] Error 1 make[1]: Leaving directory '/mnt/sdb1/Src64/Mesa-git/mesa/src' Makefile:615: recipe for target 'distclean-recursive' failed make: *** [distclean-recursive] Error 1 Reported-by: Andy Furniss <adf.lists@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-10-09 15:03:28 -04:00
Connor Abbott	bb59ba8634	nir/instr_set: remove unnecessary check in nir_instrs_equal() This was originally added to nir_instrs_equal() instead of nir_instr_can_cse() incorrectly, but this was fixed when moving to the instruction set API (as it had to be, otherwise hashing wouldn't work). Now, this is dead code since instr_can_rewrite() will only return true for texture instructions that use an index, so we can turn the check into an assert. This also means that now nir_instrs_equal(instr, instr) will always return true unless it assert-fails. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-10-09 10:15:28 -04:00
Connor Abbott	bf5f931aee	nir: make nir_instrs_equal() static This was previously tied to CSE, since it would only work for instructions where nir_can_cse() (now instr_can_rewrite()) returned true. Now that CSE uses the instruction set abstraction which only uses this internally, we can make it local to nir_instr_set.c. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-10-09 10:15:15 -04:00
Connor Abbott	e8308d0523	nir/cse: use the instruction set API This replaces an O(n^2) algorithm with an O(n) one, while allowing us to import most of the infrastructure required for GVN. The idea is to walk the dominance tree depth-first, similar to when converting to SSA, and remove the instructions from the set when we're done visiting the sub-tree of the dominance tree so that the only instructions in the set are the instructions that dominate the current block. No piglit regressions. No shader-db changes. Compilation time for full shader-db: Difference at 95.0% confidence -35.826 +/- 2.16018 -6.2852% +/- 0.378975% (Student's t, pooled s = 3.37504) v2: - rebase on start_block removal - remove useless state struct - change commit message Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-10-09 10:14:42 -04:00
Connor Abbott	523a28d3fe	nir: add an instruction set API This will replace direct usage of nir_instrs_equal() in the CSE pass, which replaces an O(n^2) algorithm with an effectively O(n) one. It'll also be useful for implementing GVN on top of GCM. v2: - Add texture support. - Add more comments. - Rename instr_can_hash() to instr_can_rewrite() since it's really more about whether its uses can be rewritten, and it's implicitly used by nir_instrs_equal() as well. - Rename nir_instr_set_add() to nir_instr_set_add_or_rewrite() (Jason). - Make the HASH() macro less magical (Topi). - Rewrite the commit message. v3: - For sorting phi sources, use a VLA, store pointers to the sources, and compare the predecessor pointer directly (Jason). Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-10-09 10:14:35 -04:00
Connor Abbott	005c2efb7b	nir: constify instruction comparison functions v2: rebase, don't constify nir_srcs_equal() as it's pass-by-value anyways Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-10-09 10:14:28 -04:00
Connor Abbott	d6bc35934f	nir: constify nir_ssa_alu_instr_src_components() Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-10-09 10:14:20 -04:00
Connor Abbott	20d6d812dc	nir: split out instruction comparison functions Right now nir_instrs_equal() is tied pretty tightly to CSE, but we're going to introduce the idea of an instruction set and tie it to that instead. In anticipation of that, move this into its own file where we'll add the rest of the instruction set implementation later. v2: Rebase on texture support. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-10-09 10:13:27 -04:00
Neil Roberts	886d46b089	nir: Add a function to determine if a source is dynamically uniform Adds nir_src_is_dynamically_uniform which returns true if the source is known to be dynamically uniform. This will be used in a later patch to add a workaround for cases that only work with dynamically uniform sources. Note that the function is not definitive, it can return false negatives (but not false positives). Currently it only detects constants and uniform accesses. It could easily be extended to include more cases. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-09 15:10:40 +02:00
Jason Ekstrand	9c528f5dfa	nir/sweep: Reparent the shader name Previously the name of the nir shader was being freed prematurely during nir_sweep. Since `756613ed35` the name was later being used to generate filenames for the optimiser debug output and these would end up with garbage from the dangling pointer. Co-authored-by: Neil Roberts <neil@linux.intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-08 08:20:31 -07:00
Timothy Arceri	763cd8c080	glsl: reduce memory footprint of uniform_storage struct The uniform will only be of a single type so store the data for opaque types in a single array. Cc: Francisco Jerez <currojerez@riseup.net> Cc: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-05 10:53:24 +11:00
Kenneth Graunke	7768b802e5	nir: Add a nir_shader_info::has_transform_feedback_varyings flag. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-04 14:00:01 -07:00
Kenneth Graunke	5d7f8cb5a5	nir: Introduce new nir_intrinsic_load_per_vertex_input intrinsics. Geometry and tessellation shaders process multiple vertices; their inputs are arrays indexed by the vertex number. While GLSL makes this look like a normal array, it can be very different behind the scenes. On Intel hardware, all inputs for a particular vertex are stored together - as if they were grouped into a single struct. This means that consecutive elements of these top-level arrays are not contiguous. In fact, they may sometimes be in completely disjoint memory segments. NIR's existing load_input intrinsics are awkward for this case, as they distill everything down to a single offset. We'd much rather keep the vertex ID separate, but build up an offset as normal beyond that. This patch introduces new nir_intrinsic_load_per_vertex_input intrinsics to handle this case. They work like ordinary load_input intrinsics, but have an extra source (src[0]) which represents the outermost array index. v2: Rebase on earlier refactors. v3: Use ssa defs instead of nir_srcs, rebase on earlier refactors. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-04 14:00:01 -07:00
Kenneth Graunke	f2a4b40cf1	nir/lower_io: Make get_io_offset() return a nir_ssa_def * for indirects. get_io_offset() already walks the dereference chain and discovers whether or not we have an indirect; we can just return that rather than computing it a second time via deref_has_indirect(). This means moving the call a bit earlier. By returning a nir_ssa_def *, we can pass back both an existence flag (via NULL checking the pointer) and the value in one parameter. It also simplifies the code somewhat. nir_lower_samplers works in a similar fashion. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-04 14:00:01 -07:00
Jason Ekstrand	050e4787d3	nir: Add a nir_foreach_variable macro This is a common enough operation that it's nice to not have to think about the arguments to foreach_list_typed every time. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-02 21:21:16 -07:00
Jason Ekstrand	7a8d06b6dd	nir: Move GS data to nir_shader_info Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-02 14:22:53 -07:00
Jason Ekstrand	e4fea486da	nir: Add a nir_shader_info struct This commit also adds code to glsl_to_nir and prog_to_nir to fill it out. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-02 14:22:53 -07:00
Jason Ekstrand	cd1ae6ebfa	nir/glsl: Take a gl_shader_program and a stage rather than a gl_shader Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-02 14:22:53 -07:00
Chris Wilson	6b7036498a	nir: Fix uninitialized 'progress' variable in nir_lower_system_values. Commit `0a1adaf11d` (nir: Report progress from nir_lower_system_values().) introduced a bug caught by Valgrind: ==823== Conditional jump or move depends on uninitialised value(s) ==823== at 0xB09020C: convert_block (nir_lower_system_values.c:68) ==823== by 0xB079FB8: foreach_cf_node (nir.c:1310) ==823== by 0xB07A0AF: nir_foreach_block (nir.c:1336) ==823== by 0xB09026B: convert_impl (nir_lower_system_values.c:79) ... ==823== Uninitialised value was created by a stack allocation ==823== at 0xB090249: convert_impl (nir_lower_system_values.c:76) which is trivially fixed by initializing progress. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-02 10:44:28 -07:00
Connor Abbott	33da78adee	nir/remove_phis: handle trivial back-edges Some loops may have phi nodes that look like: foo = ... loop { bar = phi(foo, bar) ... } in which case we can remove the phi node and replace all uses of 'bar' with 'foo'. In particular, there are some L4D2 vertex shaders with loops that, after optimization, look like: /* succs: block_1 */ loop { block block_1: /* preds: block_0 block_4 */ vec1 ssa_2195 = phi block_0: ssa_2136, block_4: ssa_994 vec1 ssa_7321 = phi block_0: ssa_8195, block_4: ssa_7321 vec1 ssa_7324 = phi block_0: ssa_8198, block_4: ssa_7324 vec1 ssa_7327 = phi block_0: ssa_8174, block_4: ssa_7327 vec1 ssa_8139 = intrinsic load_uniform () () (232) vec1 ssa_588 = ige ssa_2195, ssa_8139 /* succs: block_2 block_3 */ if ssa_588 { block block_2: /* preds: block_1 */ break /* succs: block_5 */ } else { block block_3: /* preds: block_1 */ /* succs: block_4 */ } block block_4: /* preds: block_3 */ vec1 ssa_994 = iadd ssa_2195, ssa_2150 /* succs: block_1 */ } where after removing the second, third, and fourth phi nodes, the loop becomes entirely dead, and this patch will cause the loop to be deleted entirely. No piglit regressions. Shader-db results on bdw: instructions in affected programs: 5824 -> 5664 (-2.75%) total loops in shared programs: 2234 -> 2202 (-1.43%) helped: 32 Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-10-02 13:19:45 -04:00
Kenneth Graunke	39a1d36a67	nir: Allow nir_lower_io() to only lower one type of variable. We may want to use different type_size functions for (e.g.) inputs vs. uniforms. Passing in -1 for mode ignores this, handling all modes as before. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-01 10:58:30 -07:00
Jordan Justen	4810d02112	nir: Don't set dest in SSBO store glsl_to_nir conversion This matches the function signature created in lower_ubo_reference_visitor::ssbo_store which has a void return. Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-09-29 17:17:20 -07:00
Kenneth Graunke	476e6d732f	nir: Use a system value for gl_PrimitiveIDIn. At least on Intel hardware, gl_PrimitiveIDIn comes in as a special part of the payload rather than a normal input. This is typically what we use system values for. Dave and Ilia also agree that a system value would be nicer. At some point, we should change it at the GLSL IR level as well. But that requires changing most of the drivers. For now, let's at least make NIR do the right thing, which is easy. v2: Add a comment about not creating a temporary (suggested by Iago). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-09-29 14:19:32 -07:00
Jordan Justen	4c6ddd3397	nir: Convert SYSTEM_VALUE_NUM_WORK_GROUPS to a nir intrinsic Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-29 08:23:47 -07:00
Kenneth Graunke	02530c5dc5	nir: Add a function to count the number of vertices a GS emits. Some hardware (such as Broadwell) can run geometry shaders more efficiently when the number of vertices emitted is statically known. This pass provides a way to obtain the constant vertex count, or -1 indicating that the vertex count is unknown/non-constant. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-09-26 12:01:53 -07:00
Iago Toral Quiroga	9d5c0be5d5	nir: Implement lowered SSBO atomic intrinsics The original GLSL IR intrinsics have been lowered to an internal version that accepts a block index and an offset instead of a SSBO reference. v2 (Connor): - Document the sources used by the atomic intrinsics. Reviewed-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:23 +02:00
Iago Toral Quiroga	475d9c32d1	nir/glsl_to_nir: ignore an instruction's dest if it hasn't any Reviewed-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Iago Toral Quiroga	e59ae238b6	nir: Implement __intrinsic_load_ssbo v2: - Fix ssbo loads with boolean variables. v3: - Simplify the changes (Kristian) Reviewed-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Samuel Iglesias Gonsalvez	3e70c968de	nir: modify the instruction insertion in nir_visitor::visit(ir_call ir) This patch moves nir_instr_insert_after_cf_list call into each case in the intrinsics switch at nir_visitor::visit(ir_call ir) and define a nir_dest variable which will be used when handling ir->return_deref after the switch. This patch simplifies the code for nir_intrinsic_load_ssbo implementation changes we are going to do next. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Iago Toral Quiroga	9bb7d9ecf8	nir: Implement __intrinsic_store_ssbo v2 (Connor): - Make the STORE() macro take arguments for the extra sources (and their size) and any extra indices required. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Samuel Iglesias Gonsalvez	003ce30e36	nir: Implement ir_unop_get_buffer_size This is how backends provide the buffer size required to compute the size of unsized arrays in the previous patch Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:21 +02:00
Kenneth Graunke	542d40d698	nir: Add new GS intrinsics that maintain a count of emitted vertices. This patch also introduces a lowering pass to convert the simple GS intrinsics to the new ones. See the comments above that for the rationale behind the new intrinsics. This should be useful for i965; it's a generic enough mechanism that I could see other drivers potentially using it as well, so I don't feel too bad about putting it in the generic code. v2: - Use nir_after_block_before_jump for the cursor (caught by Jason Ekstrand - I'd mistakenly used nir_after_block when rebasing this code onto the new NIR control flow API). - Remove the old emit_vertex intrinsic at the end, rather than in the middle (requested by Jason). - Use state->... directly rather than locals (requested by Jason). - Report progress from nir_lower_gs_intrinsics() (requested by me). - Remove "Authors:" section from file comment (requested by Michael Schellenberger Costa). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-23 11:00:00 -07:00
Kenneth Graunke	0a040975ec	nir: Add unit tests for control flow graphs. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com> Acked-by: Connor Abbott <cwabbott0@gmail.com>	2015-09-23 11:00:00 -07:00
Kenneth Graunke	fbaa1b19d7	nir/cf: Fix dominance metadata in the dead control flow pass. The NIR control flow modification API churns the block structure, splitting blocks, stitching them back together, and so on. Preserving information about block dominance is hard (and probably not worthwhile). This patch makes nir_cf_extract() throw away all metadata, like we do when adding/removing jumps. We then make the dead control flow pass compute dominance information right before it uses it. This is necessary because earlier work by the pass may have invalidated it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-23 11:00:00 -07:00
Kenneth Graunke	6560838703	nir/cf: Fix unlink_block_successors to actually unlink the second one. Calling unlink_blocks(block, block->successors[0]) will successfully unlink the first successor, but then will shift block->successors[1] down to block->successors[0]. So the successors[1] != NULL check will always fail. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-23 11:00:00 -07:00
Kenneth Graunke	024e5ec977	nir/cf: Alter block successors before adding a fake link. Consider the case of "while (...) { break }". Or in NIR: block block_0 (0x7ab640): ... /* succs: block_1 */ loop { block block_1: /* preds: block_0 */ break /* succs: block_2 */ } block block_2: Calling nir_handle_remove_jump(block_1, nir_jump_break) will remove the break. Unfortunately, it would mangle the predecessors and successors. Here, block_2->predecessors->entries == 1, so we would create a fake link, setting block_1->successors[1] = block_2, and adding block_1 to block_2's predecessor set. This is illegal: a block cannot specify the same successor twice. In particular, adding the predecessor would have no effect, as it was already present in the set. We'd then call unlink_block_successors(), which would delete the fake link and remove block_1 from block_2's predecessor set. It would then delete successors[0], and attempt to remove block_1 from block_2's predecessor set a second time...except that it wouldn't be present, triggering an assertion failure. The fix appears to be simple: simply unlink the block's successors and recreate them to point at the correct blocks first. Then, add the fake link. In the above example, removing the break would cause block_1 to have itself as a successor (as it becomes an infinite loop), so adding the fake link won't cause a duplicate successor. v2: Add comments (requested by Connor Abbott) and fix commit message. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-23 10:59:59 -07:00
Kenneth Graunke	0991b2eb35	nir/cf: Conditionally do block_add_normal_succs() in unlink_jump(); There is a bug where we mess up predecessors/successors due to the ordering of unlinking/recreating edges/adding fake edges. In order to fix that, I need everything in one routine. However, calling block_add_normal_succs() isn't safe from cleanup_cf_node() - it would crash trying to insert phi undefs. So unfortunately I need to add a parameter. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-23 10:59:59 -07:00

557 commits