fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-02-26 22:10:37 +01:00

Author	SHA1	Message	Date
Eric Anholt	ce76be9933	nir: Fix some wonky whitespace in nir_search.h. Reviewed-by: Ian Romanick <ian.d.romainck@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-10-04 19:15:01 +00:00
Eric Anholt	3cc914921e	nir: Factor out most of the algebraic passes C code to .c/.h. Working on the algebraic implementation, I was being driven nuts by my editor not highlighting and handling indentation for the C code. It turns out that it's basically not pass-specific code, and we can move it over to the relevant .c file. Replaces 30KB of code with 34KB of data on my i965 build. No perf diff on shader-db (n=3) Reviewed-by: Ian Romanick <ian.d.romainck@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-10-04 19:15:01 +00:00
Eric Anholt	c23db0df18	nir: Keep the range analysis HT around intra-pass until we make a change. This lets us memoize range analysis work across instructions. Reduces runtime of shader-db on Intel by -30.0288% +/- 2.1693% (n=3). Fixes: `405de7ccb6` ("nir/range-analysis: Rudimentary value range analysis pass") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-10-04 19:15:01 +00:00
Eric Anholt	7025dbe794	nir: Skip emitting no-op movs from the builder. Having passes generate these is just making more work for copy propagation (and thus probably calling more optimization passes) later. Noticed while trying to debug nir_opt_algebraic() top-to-bottom having O(n^2) behavior due to not finding new matches in replacement code. Reviewed-by: Ian Romanick <ian.d.romainck@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-10-04 19:15:01 +00:00
Eric Anholt	e7b754a05c	nir: Make nir_search's dumping go to stderr. Reviewed-by: Ian Romanick <ian.d.romainck@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-10-04 19:15:01 +00:00
Rhys Perry	1264acdf4b	nir/print: always use the right FILE * Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-10-04 15:24:10 +00:00
Erik Faye-Lund	49b32233a0	nir: initialize needs_helper_invocations as well Similar to the previous commit, we should also initialize needs_helper_invocations here. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-04 14:55:40 +00:00
Erik Faye-Lund	1d6d2ca9f1	nir: initialize uses_discard to false This matches what we do for uses_sample_qualifier, and what we do in ir_set_program_inouts.cpp as well. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-04 14:55:40 +00:00
Caio Marcelo de Oliveira Filho	61fa4b5707	glsl: Add helperInvocationEXT() builtin From EXT_demote_to_helper_invocation, implemented with the existing nir_intrinsic_is_helper_invocation. Such builtin is necessary when using `demote` because we can't redefine the value of gl_HelperInvocation (since it is an input variable). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-30 12:44:30 -07:00
Caio Marcelo de Oliveira Filho	3439956377	glsl: Parse `demote` statement When the EXT_demote_to_helper_invocation extension is enabled, `demote` is treated as a keyword, and produces an ir_demote. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-30 12:44:30 -07:00
Caio Marcelo de Oliveira Filho	af1a6f0f77	glsl: Add ir_demote To represent the new `demote` keyword when using EXT_demote_to_helper_invocation extension. Most of the changes are to include it in the visitors. Demote is not considered a control flow, so also include an empty visit member function in ir_control_flow_visitor. Only NIR actually supports `demote`, so assert the translations for TGSI and Mesa's gl_program -- since the demote is not expected to appear for those. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-30 12:44:30 -07:00
Caio Marcelo de Oliveira Filho	c81b912eb7	mesa: Extension boilerplate for EXT_demote_to_helper_invocation Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-30 12:44:30 -07:00
Daniel Schürmann	239423d234	nir: Remove unnecessary subtraction optimizations These optimizations are already covered after lowering. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-30 09:44:10 +00:00
Daniel Schürmann	99848a57b7	nir: recombine nir_op_*sub when lower_sub = false There are some optimizations which are only implemented for additions and some optimizations which assume that subtractions have been lowered. By lowering all subtractions first and later recombine for backends which prefer this option, we don't have to implement them twice. This patch also moves lower_negate to nir_opt_algebraic_late() to enable these optimizations for backends which make use of it. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-30 09:44:10 +00:00
Mauro Rossi	268fb10e9c	android: compiler/nir: build nir_divergence_analysis.c Prerequisite to avoid following radv linking error happening with aco FAILED: out/target/product/x86_64/obj_x86/SHARED_LIBRARIES/vulkan.radv_intermediates/LINKED/vulkan.radv.so ... external/mesa/src/amd/compiler/aco_instruction_selection_setup.cpp:178: error: undefined reference to 'nir_divergence_analysis' clang.real: error: linker command failed with exit code 1 (use -v to see invocation) Fixes: `df86c5f` ("nir: add divergence analysis pass.") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>	2019-09-28 15:56:28 +02:00
Andrii Simiklit	b32bb888c7	glsl: disallow incompatible matrices multiplication glsl 4.4 spec section '5.9 expressions': "The operator is multiply (), where both operands are matrices or one operand is a vector and the other a matrix. A right vector operand is treated as a column vector and a left vector operand as a row vector. In all these cases, it is required that the number of columns of the left operand is equal to the number of rows of the right operand. Then, the multiply () operation does a linear algebraic multiply, yielding an object that has the same number of rows as the left operand and the same number of columns as the right operand. Section 5.10 “Vector and Matrix Operations” explains in more detail how vectors and matrices are operated on." This fix disallows a multiplication of incompatible matrices like: mat4x3(..) * mat4x3(..) mat4x2(..) * mat4x2(..) mat3x2(..) * mat3x2(..) .... CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111664 Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>	2019-09-27 21:42:09 +00:00
Eric Anholt	7a4647ee39	shader_enums: Move MAX_DRAW_BUFFERS to this file. We include shader_enums.h from freedreno's compiler for both GL and Vulkan, and the main/config.h include resulted in polluting the namespace with things like MAX_VIEWPORTS that other Vulkan drivers use as their driver-specific maximums. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-09-27 13:34:28 -07:00
Ian Romanick	7e53bebcb5	nir/range-analysis: Use types to provide better ranges from bcsel and mov Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16328255 -> 16315391 (-0.08%) instructions in affected programs: 218318 -> 205454 (-5.89%) helped: 988 HURT: 0 helped stats (abs) min: 1 max: 72 x̄: 13.02 x̃: 10 helped stats (rel) min: 0.33% max: 16.04% x̄: 6.27% x̃: 4.88% 95% mean confidence interval for instructions value: -13.69 -12.35 95% mean confidence interval for instructions %-change: -6.55% -5.99% Instructions are helped. total cycles in shared programs: 363683977 -> 363615417 (-0.02%) cycles in affected programs: 1475193 -> 1406633 (-4.65%) helped: 923 HURT: 36 helped stats (abs) min: 1 max: 624 x̄: 75.78 x̃: 48 helped stats (rel) min: 0.08% max: 13.89% x̄: 5.20% x̃: 5.08% HURT stats (abs) min: 1 max: 179 x̄: 38.58 x̃: 4 HURT stats (rel) min: 0.06% max: 16.56% x̄: 3.33% x̃: 0.29% 95% mean confidence interval for cycles value: -75.88 -67.10 95% mean confidence interval for cycles %-change: -5.10% -4.66% Cycles are helped. Sandy Bridge total instructions in shared programs: 10785779 -> 10785654 (<.01%) instructions in affected programs: 13855 -> 13730 (-0.90%) helped: 67 HURT: 0 helped stats (abs) min: 1 max: 15 x̄: 1.87 x̃: 1 helped stats (rel) min: 0.20% max: 3.45% x̄: 0.97% x̃: 0.78% 95% mean confidence interval for instructions value: -2.47 -1.26 95% mean confidence interval for instructions %-change: -1.13% -0.81% Instructions are helped. total cycles in shared programs: 153704799 -> 153704481 (<.01%) cycles in affected programs: 101509 -> 101191 (-0.31%) helped: 38 HURT: 13 helped stats (abs) min: 1 max: 38 x̄: 12.53 x̃: 16 helped stats (rel) min: 0.07% max: 2.69% x̄: 0.87% x̃: 0.53% HURT stats (abs) min: 1 max: 36 x̄: 12.15 x̃: 7 HURT stats (rel) min: 0.06% max: 2.53% x̄: 0.73% x̃: 0.44% 95% mean confidence interval for cycles value: -10.24 -2.24 95% mean confidence interval for cycles %-change: -0.75% -0.17% Cycles are helped. LOST: 2 GAINED: 0 No shader-db change on Iron Lake or GM45.	2019-09-25 15:37:05 -07:00
Ian Romanick	99ddb41e2d	nir/range-analysis: Use types in the hash key This allows the reslut of mov and bcsel to be separately interpreted as float or int depending on the use. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-25 15:37:01 -07:00
Ian Romanick	018d2b524a	nir/range-analysis: Bail if the types don't match Some shaders are hurt by this change because now a load_const(0x00000000) is not recognized as eq_zero when loaded as a float. This behavior is restored in a later patch (nir/range-analysis: Use types to provide better ranges from bcsel and mov). v2: Add a comment about reinterpretation of int/uint/bool. Suggested by Caio. Rewrite condition the check for types being float versus checking for types not being all the things that aren't float. Fixes: `405de7ccb6` ("nir/range-analysis: Rudimentary value range analysis pass") Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16327543 -> 16328255 (<.01%) instructions in affected programs: 55928 -> 56640 (1.27%) helped: 0 HURT: 208 HURT stats (abs) min: 1 max: 16 x̄: 3.42 x̃: 3 HURT stats (rel) min: 0.33% max: 6.74% x̄: 1.31% x̃: 1.12% 95% mean confidence interval for instructions value: 3.06 3.79 95% mean confidence interval for instructions %-change: 1.17% 1.46% Instructions are HURT. total cycles in shared programs: 363682759 -> 363683977 (<.01%) cycles in affected programs: 325758 -> 326976 (0.37%) helped: 44 HURT: 133 helped stats (abs) min: 1 max: 179 x̄: 33.61 x̃: 5 helped stats (rel) min: 0.06% max: 14.21% x̄: 2.47% x̃: 0.29% HURT stats (abs) min: 1 max: 157 x̄: 20.28 x̃: 14 HURT stats (rel) min: 0.07% max: 14.44% x̄: 1.42% x̃: 0.73% 95% mean confidence interval for cycles value: 0.38 13.39 95% mean confidence interval for cycles %-change: -0.06% 0.96% Inconclusive result (%-change mean confidence interval includes 0). Sandy Bridge total instructions in shared programs: 10787433 -> 10787443 (<.01%) instructions in affected programs: 1842 -> 1852 (0.54%) helped: 0 HURT: 10 HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.33% max: 1.85% x̄: 0.73% x̃: 0.49% 95% mean confidence interval for instructions value: 1.00 1.00 95% mean confidence interval for instructions %-change: 0.36% 1.10% Instructions are HURT. total cycles in shared programs: 153724543 -> 153724563 (<.01%) cycles in affected programs: 8407 -> 8427 (0.24%) helped: 1 HURT: 3 helped stats (abs) min: 18 max: 18 x̄: 18.00 x̃: 18 helped stats (rel) min: 0.98% max: 0.98% x̄: 0.98% x̃: 0.98% HURT stats (abs) min: 4 max: 18 x̄: 12.67 x̃: 16 HURT stats (rel) min: 0.21% max: 0.75% x̄: 0.56% x̃: 0.72% 95% mean confidence interval for cycles value: -21.31 31.31 95% mean confidence interval for cycles %-change: -1.11% 1.46% Inconclusive result (value mean confidence interval includes 0). No shader-db changes on Iron Lake or GM45.	2019-09-25 15:37:01 -07:00
Eric Engestrom	b3e3af0e37	glsl: turn runtime asserts of compile-time value into compile-time asserts Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-09-25 21:14:52 +00:00
Connor Abbott	36e000d832	nir: Fix overlapping vars in nir_assign_io_var_locations() When handling two variables with overlapping locations, we process the one with lower location first, and then extend the location -> driver_location map to guarantee that it's contiguous for the second variable too. But the loop had the wrong bound, so we weren't extending the map 100%, which could lead to problems later such as an incorrect num_inputs. The loop index i is an index into the slots of the variable, so we need to stop at the final slot of the variable (var_size) instead of the number of unassigned slots. This fixes spec@arb_enhanced_layouts@execution@component-layout@vs-fs-array-interleave-range on radeonsi NIR. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-25 15:53:50 +02:00
Erik Faye-Lund	88f909eb37	glsl: correct bitcast-helpers Without this, we'll incorrectly round off huge values to the nearest representable double instead of keeping it at the exact value as we're supposed to. Found by inspecting compiler-warnings. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `85faf5082f` ("glsl: Add 64-bit integer support for constant expressions") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-25 04:52:54 +00:00
Rhys Perry	12372d60ff	nir/opt_remove_phis: handle phis with no sources This can happen with loops with unreachable exits which are later optimized away. Fixes assertion in dEQP-VK.graphicsfuzz.unreachable-loops with RADV. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-09-25 00:58:30 +00:00
Connor Abbott	270fe55256	nir/opt_large_constants: Handle store writemasks This fixes some piglit tests on radeonsi NIR where a varying is initialized to a constant array in the vertex shader. Varying packing after nir_lower_io_to_temporaries creates writemasked stores which persist after pulling the constant initialization down into the fragment shader. While we're here, rewrite handle_constant_store() to do the loop over components outside the switch, so that we don't have to duplicate the writemask checking for every bitsize. Fixes: `1235850522` ("nir: Add a large constants optimization pass") Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-24 20:59:58 +00:00
Marek Olšák	780eeaf2f1	nir: define 8-byte size and alignment for bindless variables Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-23 15:34:22 -04:00
Marek Olšák	f5c103ce1d	nir: don't add bindless variables to num_textures and num_images It confuses radeonsi. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-23 15:34:05 -04:00
Jason Ekstrand	d63162cff0	nir/repair_ssa: Replace the unreachable check with the phi builder In `a3268599f3`, I attempted to fix nir_repair_ssa for unreachable blocks. However, that commit missed the possibility that the use is in a block which, itself, is unreachable. In this case, we can end up in an infinite loop trying to replace a def with itself. Even though a no-op replacement is a fine operation, it keeps extending the end of the uses list as we're walking it. Instead of explicitly checking for the group of conditions, just check if the phi builder gives us a different def. That's guaranteed to be 100% reliable and, while it lacks symmetry with the is_valid checks, should be more reliable. Fixes: `a3268599` "nir/repair_ssa: Repair dominance for unreachable..." Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-09-23 16:19:24 +00:00
Ian Romanick	317a88b920	nir/algebraic: Additional D3D Boolean optimization I observed this pattern in several shaders in Hand of Fate 2 while investigating bugzilla #111490. This also led to the related bugzilla #111578. The shaders from HoF2 are not in shader-db. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Skylake and Ice Lake had similar results. (Ice Lake shown) total instructions in shared programs: 16222621 -> 16205419 (-0.11%) instructions in affected programs: 798418 -> 781216 (-2.15%) helped: 548 HURT: 0 helped stats (abs) min: 2 max: 158 x̄: 31.39 x̃: 35 helped stats (rel) min: 0.45% max: 28.64% x̄: 2.83% x̃: 2.09% 95% mean confidence interval for instructions value: -33.22 -29.56 95% mean confidence interval for instructions %-change: -3.11% -2.56% Instructions are helped. total cycles in shared programs: 364676209 -> 363345763 (-0.36%) cycles in affected programs: 112810504 -> 111480058 (-1.18%) helped: 546 HURT: 7 helped stats (abs) min: 2 max: 118913 x̄: 2439.77 x̃: 2340 helped stats (rel) min: 0.08% max: 37.56% x̄: 1.46% x̃: 1.08% HURT stats (abs) min: 2 max: 770 x̄: 238.00 x̃: 43 HURT stats (rel) min: 0.02% max: 11.24% x̄: 3.71% x̃: 0.35% 95% mean confidence interval for cycles value: -2884.33 -1927.41 95% mean confidence interval for cycles %-change: -1.59% -1.21% Cycles are helped. total spills in shared programs: 8870 -> 8514 (-4.01%) spills in affected programs: 1230 -> 874 (-28.94%) helped: 161 HURT: 0 total fills in shared programs: 21901 -> 21348 (-2.52%) fills in affected programs: 2120 -> 1567 (-26.08%) helped: 155 HURT: 5 Broadwell and Haswell had similar results. (Broadwell shown) total instructions in shared programs: 14994910 -> 14975495 (-0.13%) instructions in affected programs: 839033 -> 819618 (-2.31%) helped: 548 HURT: 0 helped stats (abs) min: 2 max: 299 x̄: 35.43 x̃: 49 helped stats (rel) min: 0.39% max: 19.89% x̄: 2.91% x̃: 2.22% 95% mean confidence interval for instructions value: -37.46 -33.40 95% mean confidence interval for instructions %-change: -3.12% -2.70% Instructions are helped. total cycles in shared programs: 386032453 -> 384450722 (-0.41%) cycles in affected programs: 117807357 -> 116225626 (-1.34%) helped: 547 HURT: 6 helped stats (abs) min: 2 max: 22096 x̄: 2892.01 x̃: 3926 helped stats (rel) min: 0.17% max: 10.34% x̄: 1.56% x̃: 1.31% HURT stats (abs) min: 4 max: 60 x̄: 32.83 x̃: 29 HURT stats (rel) min: 0.38% max: 12.79% x̄: 5.86% x̃: 4.65% 95% mean confidence interval for cycles value: -3060.28 -2660.27 95% mean confidence interval for cycles %-change: -1.59% -1.37% Cycles are helped. total spills in shared programs: 23372 -> 21869 (-6.43%) spills in affected programs: 11730 -> 10227 (-12.81%) helped: 352 HURT: 0 total fills in shared programs: 34747 -> 35351 (1.74%) fills in affected programs: 11013 -> 11617 (5.48%) helped: 3 HURT: 347 Ivy Bridge and Sandybridge had similar results. (Ivy Bridge shown) total instructions in shared programs: 11956420 -> 11956126 (<.01%) instructions in affected programs: 14898 -> 14604 (-1.97%) helped: 98 HURT: 0 helped stats (abs) min: 3 max: 3 x̄: 3.00 x̃: 3 helped stats (rel) min: 1.30% max: 3.57% x̄: 2.08% x̃: 2.00% 95% mean confidence interval for instructions value: -3.00 -3.00 95% mean confidence interval for instructions %-change: -2.18% -1.98% Instructions are helped. total cycles in shared programs: 178791217 -> 178790792 (<.01%) cycles in affected programs: 149763 -> 149338 (-0.28%) helped: 91 HURT: 7 helped stats (abs) min: 3 max: 107 x̄: 20.63 x̃: 16 helped stats (rel) min: 0.13% max: 6.91% x̄: 1.40% x̃: 1.18% HURT stats (abs) min: 3 max: 322 x̄: 207.43 x̃: 322 HURT stats (rel) min: 0.14% max: 19.85% x̄: 12.73% x̃: 17.41% 95% mean confidence interval for cycles value: -18.94 10.27 95% mean confidence interval for cycles %-change: -1.28% 0.49% Inconclusive result (value mean confidence interval includes 0).	2019-09-19 14:22:22 -07:00
Ian Romanick	92f70df8c3	nir/algebraic: Do not apply late DPH optimization in vertex processing stages Some shaders do not use 'invariant' in vertex and (possibly) geometry shader stages on some outputs that are intended to be invariant. For various reasons, this optimization may not be fully applied in all shaders used for different rendering passes of the same geometry. This can result in Z-fighting artifacts (at best). For now, disable this optimization in these stages. In tessellation stages applications seem to use 'precise' when necessary, so allow the optimization in those stages. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111490 Fixes: `09705747d7` ("nir/algebraic: Reassociate fadd into fmul in DPH-like pattern") All Gen8+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16194726 -> 16344745 (0.93%) instructions in affected programs: 2855172 -> 3005191 (5.25%) helped: 6 HURT: 20279 helped stats (abs) min: 1 max: 3 x̄: 1.33 x̃: 1 helped stats (rel) min: 0.44% max: 1.00% x̄: 0.54% x̃: 0.44% HURT stats (abs) min: 1 max: 32 x̄: 7.40 x̃: 7 HURT stats (rel) min: 0.14% max: 42.86% x̄: 8.58% x̃: 6.56% 95% mean confidence interval for instructions value: 7.34 7.45 95% mean confidence interval for instructions %-change: 8.48% 8.67% Instructions are HURT. total cycles in shared programs: 364471296 -> 365014683 (0.15%) cycles in affected programs: 32421530 -> 32964917 (1.68%) helped: 2925 HURT: 16144 helped stats (abs) min: 1 max: 403 x̄: 18.39 x̃: 5 helped stats (rel) min: <.01% max: 22.61% x̄: 1.97% x̃: 1.15% HURT stats (abs) min: 1 max: 18471 x̄: 36.99 x̃: 15 HURT stats (rel) min: 0.02% max: 52.58% x̄: 5.60% x̃: 3.87% 95% mean confidence interval for cycles value: 21.58 35.41 95% mean confidence interval for cycles %-change: 4.36% 4.52% Cycles are HURT.	2019-09-19 14:21:31 -07:00
Jason Ekstrand	0c4e89ad5b	Move blob from compiler/ to util/ There's nothing whatsoever compiler-specific about it other than that's currently where it's used. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-19 19:56:22 +00:00
Samuel Iglesias Gonsálvez	5ed5e76741	nir/algebraic: refactor inexact opcode restrictions Refactor the code to avoid calling a lot of time to auxiliary functions when it is not really needed. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-09-19 18:57:27 +02:00
Caio Marcelo de Oliveira Filho	d38e0a6326	spirv: Add missing break for capability handling New added cases "stole" the previous break. Fixes: `420ad0a1a3` ("spirv: check support for SPV_KHR_float_controls capabilities") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-18 15:49:14 -07:00
Connor Abbott	57e0bb8ccc	nir/opt_if: Fix undef handling in opt_split_alu_of_phi() The pass assumed that "Most ALU ops produce an undefined result if any source is undef" which is completely untrue. Due to how we lower if statements to selects and then optimize on those selects later, we simply cannot make that assumption. In particular this pass tried to replace an ior of undef and true, which had been generated by optimizing a select which itself came from flattening an if statement, to undef causing a miscompilation for a CTS test with radeonsi NIR. We fix this by always doing what the non-undef path did, i.e. duplicate the instruction twice. If there are cases where the instruction before the loop can be folded away due to having an undef source, we should add these to opt_undef instead. The comment above the pass says that if the phi source from before the loop is undef, and we can fold the instruction before the loop to undef, then we can ignore sources of the original instruction that don't dominate the block before the loop because we don't need them to create the instruction before the loop. This is incorrect, because the instruction at the bottom of the loop would get those sources from the wrong loop iteration. The code never actually did what the comment said, so we only have to update the comment to match what the pass actually does. We also update the example to more closely match what most actual loops look like after vtn and peephole_select. There are no shader-db changes with i965, radeonsi NIR, or radv. With anv and my vkpipeline-db there's only one change: total instructions in shared programs: 14125290 -> 14125300 (<.01%) instructions in affected programs: 2598 -> 2608 (0.38%) helped: 0 HURT: 1 total cycles in shared programs: 2051473437 -> 2051473397 (<.01%) cycles in affected programs: 36697 -> 36657 (-0.11%) helped: 1 HURT: 0 Fixes KHR-GL45.shader_subroutine.control_flow_and_returned_subroutine_values_used_as_subroutine_input with radeonsi NIR.	2019-09-18 17:18:34 -04:00
Andres Gomez	d9760f8935	nir/opcodes: Clear variable names confusion Having Python and C variables sharing name in the same block of code makes its understanding a bit confusing. Make it explicit that the Python bit_size variable refers to the destination bit size. Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-18 23:59:07 +03:00
Samuel Iglesias Gonsálvez	dba4d0a319	nir: fix fmin/fmax support for doubles Until now, it was using the floating point version of fmin/fmax, instead of the double version. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	2abc6299bf	nir: fix denorm flush-to-zero in sqrt's lowering at nir_lower_double_ops v2: - Replace hard coded value with DBL_MIN (Connor). v3: - Have into account the FLOAT_CONTROLS_DENORM_PRESERVE_FP64 flag (Caio). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v2]	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	1e0e3ed15a	nir: fix denorms in unpack_half_1x16() According to VK_KHR_shader_float_controls: "Denormalized values obtained via unpacking an integer into a vector of values with smaller bit width and interpreting those values as floating-point numbers must: be flushed to zero, unless the entry point is declared with the code:DenormPreserve execution mode." v2: - Add nir_op_unpack_half_2x16_flush_to_zero opcode (Connor). v3: - Adapt to use the new NIR lowering framework (Andres). v4: - Updated to renamed shader info member and enum values (Andres). v5: - Simplify flags logic operations (Caio). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v2]	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	f097247dd8	nir/algebraic: disable inexact optimizations depending on float controls execution mode If FLOAT_CONTROLS_SIGNED_ZERO_INF_NAN_PRESERVE or FLOAT_CONTROLS_DENORM_FLUSH_TO_ZERO are enabled, do not apply the inexact optimizations so the VK_KHR_shader_float_controls execution mode is respected. v2: - Do not apply inexact optimizations if SHADER_DENORM_FLUSH_TO_ZERO is enabled (Andres). v3: - Updated to renamed shader info member (Andres). v4: - Directly access execution mode instead of dragging it by parameter (Caio). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v1]	2019-09-17 23:39:18 +03:00
Andres Gomez	3f782cdd25	nir/algebraic: mark float optimizations returning one parameter as inexact With the arrival of VK_KHR_shader_float_controls algebraic optimizations for float types of the form (('fop', a, b), a) become inexact depending on the execution mode. For example, if we have activated SHADER_DENORM_FLUSH_TO_ZERO, in case of a denorm value for the "a" parameter, we cannot return it still as a denorm, it needs to be flushed to zero. Therefore, we mark now all those operations as inexact. Suggested-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	5e22f3e29a	nir/constant_expressions: mind rounding mode converting from float to float16 destinations v2: - Move the op-code specific knowledge to nir_opcodes.py even if it means a rount trip conversion (Connor). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	ef681cf971	nir/opcodes: make sure f2f16_rtz and f2f16_rtne behavior is not overriden by the float controls execution mode Suggested-by: Connor Abbott <cwabbott0@gmail.com> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	7580707345	nir: mind rounding mode on fadd, fsub, fmul and fma opcodes According to Vulkan spec, the new execution modes affect only correctly rounded SPIR-V instructions, which includes fadd, fsub and fmul. v2: - Fix fmul, fsub and fadd round-to-zero definitions, they should use auxiliary functions to calculate the proper value because Mesa uses round-to-nearest-even rounding mode by default (Connor). v3: - Do an actual fused multiply-add at ffma (Connor). v4: - Simplify fadd and fmul for bit sizes < 64 (Connor). - Do not use double ffma for 32 bits float (Connor). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v3]	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	0ac07c7ca7	nir: add support for round to zero rounding mode to nir_op_f2f32 f2f16's rounding modes are already handled and f2f64 don't need it as there is not a floating point type with higher bit size than 64 for now. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	f7d73db353	nir: add support for flushing to zero denorm constants v2: - Refactor conditions and shared function (Connor). - Move code to nir_eval_const_opcode() (Connor). - Don't flush to zero on fquantize2f16 From Vulkan spec, VK_KHR_shader_float_controls section: "3) Do denorm and rounding mode controls apply to OpSpecConstantOp? RESOLVED: Yes, except when the opcode is OpQuantizeToF16." v3: - Fix bit size (Connor). - Fix execution mode on nir_loop_analize (Connor). v4: - Adapt after API changes to nir_eval_const_opcode (Andres). v5: - Simplify constant_denorm_flush_to_zero (Caio). v6: - Adapt after API changes and to use the new constant constructors (Andres). - Replace MAYBE_UNUSED with UNUSED as the first is going away (Andres). v7: - Adapt to newly added calls (Andres). - Simplified the auxiliary to flush denorms to zero (Caio). - Updated to renamed supported capabilities member (Andres). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v4] Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	45668a8be1	nir: add auxiliary functions to detect if a mode is enabled v2: - Added more functions. v3: - Simplify most of the functions (Caio). v4: - Updated to renamed enum values (Andres). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v2] Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> [v3]	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	84781e1f1d	spirv/nir: keep track of SPV_KHR_float_controls execution modes v2: - Add support for rounding modes for each floating point bit size. v3: - Commit `e68871f6a4` ("spirv: Handle constants and types before execution modes") changed when the execution modes are handled, which affects the result of the floating point constants when the rounding mode is set in the execution mode. Moved the handling of the rounding modes before we handle the constants. v4: - Rename vtn_decoration "literals" to "operands" (Andres). - Simplify execution mode parsing util function (Caio). - Extend the comment about the timing of the handling of the rounding modes (Caio). v5: - Correct extension name (Caio). - Rename shader info member (Andres). - Rename float controls enum (Andres). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v3] Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	420ad0a1a3	spirv: check support for SPV_KHR_float_controls capabilities v2: - Correct extension name (Caio). - Rename supported capabilities member (Andres). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v1] Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:18 +03:00
Caio Marcelo de Oliveira Filho	9cf1bfcdd7	spirv: Handle ShaderLayer and ShaderViewportIndex capabilities SPIR-V 1.5 incorported the SPV_EXT_shader_viewport_index_layer but splitting into the two capabilities above. Just handle them as we support the extension already. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-09-16 19:18:01 -07:00
Caio Marcelo de Oliveira Filho	f6392e38d8	spirv: Update JSON and headers to 1.5 Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-09-16 19:17:26 -07:00

1 2 3 4 5 ...

4148 commits