fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-23 11:10:10 +01:00

Author	SHA1	Message	Date
Karol Herbst	e72beacb95	nir/loop_analyze: use nir_const_value.b for boolean results, not u32 Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-14 22:25:56 +02:00
Jason Ekstrand	10602db78c	nir/print: Use nir_src_as_int for array indices Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-04-14 22:25:56 +02:00
Jason Ekstrand	9b1e4bab6b	nir/builder: Add a nir_imm_zero helper v2: replace nir_zero_vec with nir_imm_zero (Karol Herbst) Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-04-14 22:25:56 +02:00
Karol Herbst	daaf777376	nir/builder: Move nir_imm_vec2 from blorp into the builder While we're here, fix a typo which caused it to actually return a vec4 with the third and fourth components zero. Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-14 22:25:56 +02:00
Alyssa Rosenzweig	2ce4adefa5	nir: Add nir_lower_viewport_transform On Mali hardware (supported by Panfrost and Lima), the fixed-function transformation from world-space to screen-space coordinates is done in the vertex shader prior to writing out the gl_Position varying, rather than in dedicated hardware. This commit adds a shared NIR pass for implementing coordinate transformation and lowering gl_Position writes into screen-space gl_Position writes. v2: Run directly on derefs before io/vars are lowered to cleanup the code substantially. Thank you to Qiang for this suggestion! v3: Bikeshed continues. v4: Add to Makefile.sources (per Jason's comment). Bikeshed comment. Ian and Qiang's reviews are from v3, but no real functional changes from v4. Rob's review is from v4. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Suggested-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-04-14 19:15:13 +00:00
Christian Gmeiner	b6bed115a5	nir: add lower_ftrunc Port TGSI TRUNC lowering to nir Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-13 17:54:48 +00:00
Jason Ekstrand	18ed82b084	nir: Add a pass for selectively lowering variables to scratch space This commit adds new nir_load/store_scratch opcodes which read and write a virtual scratch space. It's up to the back-end to figure out what to do with it and where to put the actual scratch data. v2: Drop const_index comments (by anholt) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-12 15:59:31 -07:00
Eric Anholt	b88ef3bd76	nir: Add a comment about how intrinsic definitions work. I was thinking about a refactor, and needed to read this first. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-12 15:56:12 -07:00
Eric Anholt	35355b4860	nir: Drop remaining references to const_index in favor of the call to use. Please don't make me read a const_index[] expression ever again. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-12 15:56:04 -07:00
Eric Anholt	6e4d3d0a2f	nir: Drop comments about the constant_index slots for load/stores. The constant_index slots are named right there in the intrinsic definition, and the comment is just a chance to get out of sync. Noticed while reviewing the lower_to_scratch changes that copy-and-pasted wrong comments, and load_ubo and load_per_vertex_output had incorrect comments currently. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-12 15:55:55 -07:00
Karol Herbst	4a3c04a11f	glsl/nir: add support for lowering bindless images_derefs v2: handle atomics as well make use of nir_rewrite_image_intrinsic v3: remove call to nir_remove_dead_derefs v4: (Timothy Arceri) don't actually call lowering yet Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v3) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 09:02:59 +02:00
Timothy Arceri	035759b61b	nir/i965/freedreno/vc4: add a bindless bool to type size functions This is required to calculate sizes correctly when we have bindless samplers/images. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 09:02:59 +02:00
Karol Herbst	3b2a9ffd60	nir: move brw_nir_rewrite_image_intrinsic into common code Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 09:02:59 +02:00
Timothy Arceri	9e3740c47f	nir: initialise some variables in opt_if_loop_last_continue() Fixes a couple of Coverity warnings CID 1444626. Fixes: `e30804c602` ("nir/radv: remove restrictions on opt_if_loop_last_continue()") Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-04-11 20:38:03 +10:00
Juan A. Suarez Romero	83f1b0e95b	nir/xfb: do not use bare interface type In commit `3b3653c4cf` we decided not to use bare types; hence do not use bare type when comparing with interface type to find out if the xfb variable is an array block. This fixes dEQP-VK.transform_feedback.* tests. Fixes: `3b3653c4cf` ("nir/spirv: don't use bare types, remove assert in split vars for testing") CC: Dave Airlie <airlied@redhat.com> CC: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-11 11:52:45 +02:00
Bas Nieuwenhuizen	282bacab4a	nir: Add access qualifiers on load_ubo intrinsic. Otherwise nir_lower_non_uniform_access crashes when it tries to get the access of a load_ubo. Fixes: `8ed583fe52` "spirv: Handle the NonUniformEXT decoration" Fixes: `e50ab2c0f2` "nir: Add access flags to deref and SSBO atomics" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-10 02:04:04 +02:00
Jason Ekstrand	6279074de1	nir: Get rid of global registers We have a pass to lower global registers to locals and many drivers dutifully call it. However, no one ever creates a global register ever so it's all dead code. It's time we bury it. Acked-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-09 00:29:36 -05:00
Jason Ekstrand	b28bad89b9	nir: Get rid of nir_register::is_packed All we ever do is initialize it to zero, clone it, print it, and validate it. No one ever sets or uses it. Acked-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-09 00:29:36 -05:00
Caio Marcelo de Oliveira Filho	fcbc5ccaae	nir: Don't set LOD=0 for compute shader that has derivative group When using NV_compute_shader_derivatives to set a derivative group, a compute shader supports texture with implicit LOD calculation, so don't set an explicit LOD. Note if the extension is used but the derivative group is not specified, it will default to LOD=0 as before. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-08 19:29:33 -07:00
Caio Marcelo de Oliveira Filho	d08a74d2bf	nir/algebraic: Lower CS derivatives to zero when no group defined In compute shaders if no derivative group is defined, the derivatives will always be zero. Specified in NV_compute_shader_derivatives. To make the check more convenient, add a "info" local variable to the generated code so we can refer to it in the Python rules. (Jason) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-08 19:29:32 -07:00
Timothy Arceri	e30804c602	nir/radv: remove restrictions on opt_if_loop_last_continue() When I implemented opt_if_loop_last_continue() I had restricted this pass from moving other if-statements inside the branch opposite the continue. At the time it was causing a bunch of spilling in shader-db for i965. However Samuel Pitoiset noticed that making this pass more aggressive significantly improved the performance of Doom on RADV. Below are the statistics he gathered. 28717 shaders in 14931 tests Totals: SGPRS: 1267317 -> 1267549 (0.02 %) VGPRS: 896876 -> 895920 (-0.11 %) Spilled SGPRs: 24701 -> 26367 (6.74 %) Code Size: 48379452 -> 48507880 (0.27 %) bytes Max Waves: 241159 -> 241190 (0.01 %) Totals from affected shaders: SGPRS: 23584 -> 23816 (0.98 %) VGPRS: 25908 -> 24952 (-3.69 %) Spilled SGPRs: 503 -> 2169 (331.21 %) Code Size: 2471392 -> 2599820 (5.20 %) bytes Max Waves: 586 -> 617 (5.29 %) The code size increase is related to Wolfenstein II; it seems largely due to an increase in phis rather than the existing jumps. This gives +10% FPS with Doom on my Vega56. Rhys Perry also benchmarked Doom on his VEGA64: Before: 72.53 FPS After: 80.77 FPS v2: disable pass on non-AMD drivers Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1) Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-09 11:29:41 +10:00
Jason Ekstrand	50f3535d1f	nir/search: Search for all combinations of commutative ops Consider the following search expression and NIR sequence: ('iadd', ('imul', a, b), b) ssa_2 = imul ssa_0, ssa_1 ssa_3 = iadd ssa_2, ssa_0 The current algorithm is greedy and, the moment the imul finds a match, it commits those variable names and returns success. In the above example, it maps a -> ssa_0 and b -> ssa_1. When we then try to match the iadd, it sees that ssa_0 is not b and fails to match. The iadd match will attempt to flip itself and try again (which won't work) but it cannot ask the imul to try a flipped match. This commit instead counts the number of commutative ops in each expression and assigns an index to each. It then does a loop and loops over the full combinatorial matrix of commutative operations. In order to keep things sane, we limit it to at most 4 commutative operations (16 combinations). There is only one optimization in opt_algebraic that goes over this limit and it's the bitfieldReverse detection for some UE4 demo. Shader-db results on Kaby Lake: total instructions in shared programs: 15310125 -> 15302469 (-0.05%) instructions in affected programs: 1797123 -> 1789467 (-0.43%) helped: 6751 HURT: 2264 total cycles in shared programs: 357346617 -> 357202526 (-0.04%) cycles in affected programs: 15931005 -> 15786914 (-0.90%) helped: 6024 HURT: 3436 total loops in shared programs: 4360 -> 4360 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total spills in shared programs: 23675 -> 23666 (-0.04%) spills in affected programs: 235 -> 226 (-3.83%) helped: 5 HURT: 1 total fills in shared programs: 32040 -> 32032 (-0.02%) fills in affected programs: 190 -> 182 (-4.21%) helped: 6 HURT: 2 LOST: 18 GAINED: 5 Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2019-04-08 21:38:48 +00:00
Jason Ekstrand	ad8c145658	nir/algebraic: Add some logical OR and AND patterns The new OR pattern has been seen in the wild and can end up being generated by GLSLang. Not sure about the other two new patterns but we may as well throw them in for completeness. While we're here, we can drop the '@bool' specifier from the one pattern because specifying True already implies 1-bit which basically implies boolean. Shader-db results on Kaby Lake: total instructions in shared programs: 15321227 -> 15321129 (<.01%) instructions in affected programs: 3594 -> 3496 (-2.73%) helped: 6 HURT: 0 total cycles in shared programs: 357481321 -> 357479725 (<.01%) cycles in affected programs: 44109 -> 42513 (-3.62%) helped: 6 HURT: 0 VkPipeline-DB results on Kaby Lake: total instructions in shared programs: 3770504 -> 3769734 (-0.02%) instructions in affected programs: 19058 -> 18288 (-4.04%) helped: 163 HURT: 0 total cycles in shared programs: 1417583701 -> 1417569727 (<.01%) cycles in affected programs: 750958 -> 736984 (-1.86%) helped: 158 HURT: 1 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-04-05 18:39:06 -05:00
Jason Ekstrand	03a72d96d8	nir/algebraic: Drop some @bool specifiers Now that we have one-bit booleans, we don't need to rely on looking at parent instructions in order to figure out if a value is a Boolean most of the time. We can drop these specifiers and now the optimizations will apply more generally. Shader-DB results on Kaby Lake: total instructions in shared programs: 15321168 -> 15321227 (<.01%) instructions in affected programs: 8836 -> 8895 (0.67%) helped: 1 HURT: 31 total cycles in shared programs: 357481781 -> 357481321 (<.01%) cycles in affected programs: 146524 -> 146064 (-0.31%) helped: 22 HURT: 10 total spills in shared programs: 23675 -> 23673 (<.01%) spills in affected programs: 11 -> 9 (-18.18%) helped: 1 HURT: 0 total fills in shared programs: 32040 -> 32036 (-0.01%) fills in affected programs: 27 -> 23 (-14.81%) helped: 1 HURT: 0 No change in VkPipeline-DB Looking at the instructions hurt, a bunch of them seem to be a case where doing exactly the right thing in NIR ends up doing the wrong-ish thing in the back-end because flags are dumb. In particular, there's a case where we have a MUL followed by a CMP followed by a SEL and when we turn that SEL into an OR, it uses the GRF result of the CMP rather than the flag result so the CMP can't be merged with the MUL. Those shaders appear to schedule better according to the cycle estimates so I guess it's a win? Also it helps spilling in one Car Chase compute shader. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-04-05 18:39:00 -05:00
Caio Marcelo de Oliveira Filho	c037dbb0ef	nir: Take if_uses into account when repairing SSA If a def is used as a condition before its definition, we should also consider this a case to repair. When repairing, make sure we rewrite any if conditions too. Found while inspecting a SPIR-V conversion from a 'continue block' that contains a conditional branch. We pull the continue block up to the beginning of the loop, and the condition in the branch ends up defined afterwards. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Fixes: `364212f1ed` "nir: Add a pass to repair SSA form"	2019-04-05 09:43:46 -07:00
Samuel Pitoiset	5eb17506e1	nir: do not pack varying with different types The current algorithm only supports packing 32-bit types. If a shader uses both 16-bit and 32-bit varyings, we shouldn't compact them together. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-04-05 13:57:42 +02:00
Alyssa Rosenzweig	a83862754e	nir: Add "viewport vector" system values While a partial set of viewport system values exist, these are scalar values, which is a poor fit for viewport transformations on vector ISAs like Midgard (where the vec3 values for scale and offset each need to be coherent in a vec4 uniform slot to take advantage of vectorized transform math). This patch adds vec3 scale/offset fields corresponding to the 3D Gallium viewport / glViewport+depth Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-04 03:44:09 +00:00
Dave Airlie	eb8fefe090	nir: use proper array sizing define for vectors If we increase the vector size in the future it would be good to not have to fix these up, this should change nothing at present. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-03 13:59:06 +10:00
Timothy Arceri	d8ce915a61	Revert "nir: propagate known constant values into the if-then branch" This reverts commit `4218b6422c`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110311	2019-04-03 13:24:18 +11:00
Timothy Arceri	4218b6422c	nir: propagate known constant values into the if-then branch Helps Max Waves / VGPR use in a bunch of Unigine Heaven shaders. shader-db results radeonsi (VEGA): Totals from affected shaders: SGPRS: 5505440 -> 5505872 (0.01 %) VGPRS: 3077520 -> 3077296 (-0.01 %) Spilled SGPRs: 39032 -> 39030 (-0.01 %) Spilled VGPRs: 16326 -> 16326 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 744 -> 744 (0.00 %) dwords per thread Code Size: 123755028 -> 123753316 (-0.00 %) bytes Compile Time: 2751028 -> 2560786 (-6.92 %) milliseconds LDS: 1415 -> 1415 (0.00 %) blocks Max Waves: 972192 -> 972240 (0.00 %) Wait states: 0 -> 0 (0.00 %) vkpipeline-db results RADV (VEGA): Totals from affected shaders: SGPRS: 160 -> 160 (0.00 %) VGPRS: 88 -> 88 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 18268 -> 18152 (-0.63 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 26 -> 26 (0.00 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-03 10:04:48 +11:00
Rob Clark	1ae0c030cb	nir: add lower_all_io_to_elements I need this part of lower_all_io_to_temps but without the actual lowering to temps part. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-30 12:56:01 -04:00
Rob Clark	e5e67228f5	nir: print var name for load_interpolated_input too Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Karol Herbst <kherbst@redhat.com>	2019-03-30 12:55:47 -04:00
Jason Ekstrand	7dbd934e26	nir: Lock around validation fail shader dumping This prevents getting mixed-up results if a multi-threaded app has two validation errors in different threads. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-29 21:57:51 -05:00
Karol Herbst	fea0caea2b	nir/validate: validate that tex deref sources are actually derefs Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-29 16:03:22 +01:00
Karol Herbst	6ffc72472c	nir/print: fix printing the image_array intrinsic index Fixes: `0de003be03` ("nir: Add handle/index-based image intrinsics") Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-29 16:03:22 +01:00
Brian Paul	4ee057eaef	nir: use {0} initializer instead of {} to fix MSVC build Trivial change. Fixes: `c6ee46a75` ("nir: Add nir_alu_srcs_negative_equal")	2019-03-28 20:34:23 -06:00
Ian Romanick	2cf59861a8	nir: Add partial redundancy elimination for compares This pass attempts to detect code sequences like if (x < y) { z = y - x; ... } and replace them with sequences like t = x - y; if (t < 0) { z = -t; ... } On architectures where the subtract can generate the flags used by the if-statement, this saves an instruction. It's also possible that moving an instruction out of the if-statement will allow nir_opt_peephole_select to convert the whole thing to a bcsel. Currently only floating point compares and adds are supported. Adding support for integer will be a challenge due to integer overflow. There are a couple possible solutions, but they may not apply to all architectures. v2: Fix a typo in the commit message and a couple typos in comments. Fix possible NULL pointer deref from result of push_block(). Add missing (-A + B) case. Suggested by Caio. v3: Fix is_not_const_zero to work correctly with types other than nir_type_float32. Suggested by Ken. v4: Add some comments explaining how this works. Suggested by Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-28 15:35:53 -07:00
Ian Romanick	c6ee46a753	nir: Add nir_alu_srcs_negative_equal v2: Move bug fix in get_neg_instr from the next patch to this patch (where it was intended to be in the first place). Noticed by Caio. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-28 15:35:52 -07:00
Ian Romanick	be1cc3552b	nir: Add nir_const_value_negative_equal v2: Rebase on 1-bit Boolean changes. Reviewed-by: Thomas Helland <thomashelland90@gmail.com> [v1] Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-28 15:35:52 -07:00
Ian Romanick	ae21b52e1d	nir/algebraic: Add missing 16-bit extract_[iu]8 patterns No shader-db changes on any Intel platform. v2: Use a loop to generate patterns. Suggested by Jason. v3: Fix a copy-and-paste bug in the extract_[ui] of ishl loop that would replace an extract_i8 with an extract_u8. This broke ~180 tests. This bug was introduced in v2. Reviewed-by: Matt Turner <mattst88@gmail.com> [v1] Reviewed-by: Dylan Baker <dylan@pnwbakers.com> [v2] Acked-by: Jason Ekstrand <jason@jlekstrand.net> [v2]	2019-03-28 15:35:52 -07:00
Ian Romanick	cbad201c2b	nir/algebraic: Add missing 64-bit extract_[iu]8 patterns No shader-db changes on any Intel platform. v2: Use a loop to generate patterns. Suggested by Jason. v3: Fix a copy-and-paste bug in the extract_[ui] of ishl loop that would replace an extract_i8 with an extract_u8. This broke ~180 tests. This bug was introduced in v2. Reviewed-by: Matt Turner <mattst88@gmail.com> [v1] Reviewed-by: Dylan Baker <dylan@pnwbakers.com> [v2] Acked-by: Jason Ekstrand <jason@jlekstrand.net> [v2]	2019-03-28 15:35:52 -07:00
Ian Romanick	bc17f5a2a3	nir/algebraic: Remove redundant extract_[iu]8 patterns No shader-db changes on any Intel platform. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-28 15:35:52 -07:00
Ian Romanick	c152672e68	nir/algebraic: Fix up extract_[iu]8 after loop unrolling Skylake, Broadwell, and Haswell had similar results. (Skylake shown) total instructions in shared programs: 15256840 -> 15256837 (<.01%) instructions in affected programs: 4713 -> 4710 (-0.06%) helped: 3 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.06% max: 0.08% x̄: 0.06% x̃: 0.06% total cycles in shared programs: 372286583 -> 372286583 (0.00%) cycles in affected programs: 198516 -> 198516 (0.00%) helped: 1 HURT: 1 helped stats (abs) min: 10 max: 10 x̄: 10.00 x̃: 10 helped stats (rel) min: <.01% max: <.01% x̄: <.01% x̃: <.01% HURT stats (abs) min: 10 max: 10 x̄: 10.00 x̃: 10 HURT stats (rel) min: 0.01% max: 0.01% x̄: 0.01% x̃: 0.01% No changes on any other Intel platform. v2: Use a loop to generate patterns. Suggested by Jason. v3: Fix a copy-and-paste bug in the extract_[ui] of ishl loop that would replace an extract_i8 with an extract_u8. This broke ~180 tests. This bug was introduced in v2. Reviewed-by: Matt Turner <mattst88@gmail.com> [v1] Reviewed-by: Dylan Baker <dylan@pnwbakers.com> [v2] Acked-by: Jason Ekstrand <jason@jlekstrand.net> [v2]	2019-03-28 15:35:52 -07:00
Dave Airlie	b779baa9bf	nir/deref: fix struct wrapper casts. (v3) llvm/spir-v spits out some struct a { struct b {} }, but it doesn't deref, it casts (struct a) to (struct b), reconstruct struct derefs instead of casts for these. v2: use ssa_def_rewrite uses, rework the type restrictions (Jason) v3: squish more stuff into one function, drop unused temp (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-29 08:10:50 +10:00
Samuel Pitoiset	4d0b03c83d	nir: add nir_{load,store}_deref_with_access() helpers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-27 09:57:27 +01:00
Timothy Arceri	e76ae39ae2	nir: add support for user defined select control This will allow us to make use of the selection control support in spirv and the GL support provided by EXT_control_flow_attributes. Note this only supports if-statements as we don't support switches in NIR. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108841	2019-03-27 02:39:12 +00:00
Timothy Arceri	b56451f82c	nir: add support for user defined loop control This will allow us to make use of the loop control support in spirv and the GL support provided by EXT_control_flow_attributes. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108841	2019-03-27 02:39:12 +00:00
Jason Ekstrand	e50ab2c0f2	nir: Add access flags to deref and SSBO atomics We will need them for a new ACCESS_NON_UNIFORM flag that's about to be added in the next commit. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-25 16:12:09 -05:00
Jason Ekstrand	40074ebf74	nir: Add texture sources and intrinsics for bindless On Intel, we have both bindless and bindful and we'd like to use them at the same time if we can so we need to be able to distinguish at the NIR level between the two. This also fixes nir_lower_tex to properly handle bindless in its tex_texture_size and get_texture_lod helpers. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-25 16:12:09 -05:00
Jason Ekstrand	3bd5457641	nir: Add a lowering pass for non-uniform resource access Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-25 15:00:36 -05:00
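
The message for 50f3535d1f (nir/search: Search for all combinations of commutative ops) describes numbering the commutative ops in a search expression and looping over every swap combination. The sketch below is a rough illustration of that idea, not the Mesa implementation: the tuple-based instruction table, the helper names (`count_commutative`, `try_combination`, `find_match`), and the choice to leave ops past the limit unswapped are all assumptions made for this example. Only the search expression, the two-instruction NIR sequence, and the limit of 4 commutative ops (16 combinations) come from the commit message.

```python
# Illustrative subset of commutative opcodes (assumption for this sketch).
COMMUTATIVE_OPS = {"iadd", "imul"}
# Limit quoted in the commit message: at most 4 ops, i.e. 16 combinations.
MAX_COMMUTATIVE = 4


def count_commutative(pattern):
    """Count the commutative ops in a nested-tuple search expression."""
    if isinstance(pattern, str):
        return 0
    op, *srcs = pattern
    return int(op in COMMUTATIVE_OPS) + sum(count_commutative(s) for s in srcs)


def _match(pattern, value, defs, bindings, combo, next_index):
    # A string is a pattern variable: bind it, or check the earlier binding.
    if isinstance(pattern, str):
        if pattern in bindings:
            return bindings[pattern] == value
        bindings[pattern] = value
        return True
    instr = defs.get(value)
    if instr is None:
        return False
    op, *srcs = instr
    pat_op, *pat_srcs = pattern
    if op != pat_op or len(srcs) != len(pat_srcs):
        return False
    if op in COMMUTATIVE_OPS:
        # Each commutative op gets the next index in pre-order; bit `index`
        # of `combo` decides whether its sources are swapped for this attempt.
        index = next_index[0]
        next_index[0] += 1
        if index < MAX_COMMUTATIVE and (combo >> index) & 1:
            srcs = srcs[::-1]
    return all(_match(p, s, defs, bindings, combo, next_index)
               for p, s in zip(pat_srcs, srcs))


def try_combination(pattern, root, defs, combo):
    """Attempt one fixed swap combination; return the bindings or None."""
    bindings = {}
    if _match(pattern, root, defs, bindings, combo, [0]):
        return bindings
    return None


def find_match(pattern, root, defs):
    """Loop over the full matrix of swap combinations, up to the limit."""
    n = min(count_commutative(pattern), MAX_COMMUTATIVE)
    for combo in range(1 << n):
        bindings = try_combination(pattern, root, defs, combo)
        if bindings is not None:
            return bindings
    return None


# The sequence from the commit message:
#   ssa_2 = imul ssa_0, ssa_1
#   ssa_3 = iadd ssa_2, ssa_0
defs = {
    "ssa_2": ("imul", "ssa_0", "ssa_1"),
    "ssa_3": ("iadd", "ssa_2", "ssa_0"),
}
pattern = ("iadd", ("imul", "a", "b"), "b")
print(find_match(pattern, "ssa_3", defs))
```

With no swaps the imul binds a to ssa_0 and b to ssa_1, so the outer iadd cannot match; the combination that swaps only the imul binds b to ssa_0 and succeeds, which is the case the greedy matcher described in the commit message could not reach.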


2290 commits