fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 13:38:19 +02:00

Author	SHA1	Message	Date
Jason Ekstrand	d63162cff0	nir/repair_ssa: Replace the unreachable check with the phi builder In `a3268599f3`, I attempted to fix nir_repair_ssa for unreachable blocks. However, that commit missed the possibility that the use is in a block which, itself, is unreachable. In this case, we can end up in an infinite loop trying to replace a def with itself. Even though a no-op replacement is a fine operation, it keeps extending the end of the uses list as we're walking it. Instead of explicitly checking for the group of conditions, just check if the phi builder gives us a different def. That's guaranteed to be 100% reliable and, while it lacks symmetry with the is_valid checks, should be more reliable. Fixes: `a3268599` "nir/repair_ssa: Repair dominance for unreachable..." Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-09-23 16:19:24 +00:00
Ian Romanick	317a88b920	nir/algebraic: Additional D3D Boolean optimization I observed this pattern in several shaders in Hand of Fate 2 while investigating bugzilla #111490. This also led to the related bugzilla #111578. The shaders from HoF2 are not in shader-db. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Skylake and Ice Lake had similar results. (Ice Lake shown) total instructions in shared programs: 16222621 -> 16205419 (-0.11%) instructions in affected programs: 798418 -> 781216 (-2.15%) helped: 548 HURT: 0 helped stats (abs) min: 2 max: 158 x̄: 31.39 x̃: 35 helped stats (rel) min: 0.45% max: 28.64% x̄: 2.83% x̃: 2.09% 95% mean confidence interval for instructions value: -33.22 -29.56 95% mean confidence interval for instructions %-change: -3.11% -2.56% Instructions are helped. total cycles in shared programs: 364676209 -> 363345763 (-0.36%) cycles in affected programs: 112810504 -> 111480058 (-1.18%) helped: 546 HURT: 7 helped stats (abs) min: 2 max: 118913 x̄: 2439.77 x̃: 2340 helped stats (rel) min: 0.08% max: 37.56% x̄: 1.46% x̃: 1.08% HURT stats (abs) min: 2 max: 770 x̄: 238.00 x̃: 43 HURT stats (rel) min: 0.02% max: 11.24% x̄: 3.71% x̃: 0.35% 95% mean confidence interval for cycles value: -2884.33 -1927.41 95% mean confidence interval for cycles %-change: -1.59% -1.21% Cycles are helped. total spills in shared programs: 8870 -> 8514 (-4.01%) spills in affected programs: 1230 -> 874 (-28.94%) helped: 161 HURT: 0 total fills in shared programs: 21901 -> 21348 (-2.52%) fills in affected programs: 2120 -> 1567 (-26.08%) helped: 155 HURT: 5 Broadwell and Haswell had similar results. (Broadwell shown) total instructions in shared programs: 14994910 -> 14975495 (-0.13%) instructions in affected programs: 839033 -> 819618 (-2.31%) helped: 548 HURT: 0 helped stats (abs) min: 2 max: 299 x̄: 35.43 x̃: 49 helped stats (rel) min: 0.39% max: 19.89% x̄: 2.91% x̃: 2.22% 95% mean confidence interval for instructions value: -37.46 -33.40 95% mean confidence interval for instructions %-change: -3.12% -2.70% Instructions are helped. total cycles in shared programs: 386032453 -> 384450722 (-0.41%) cycles in affected programs: 117807357 -> 116225626 (-1.34%) helped: 547 HURT: 6 helped stats (abs) min: 2 max: 22096 x̄: 2892.01 x̃: 3926 helped stats (rel) min: 0.17% max: 10.34% x̄: 1.56% x̃: 1.31% HURT stats (abs) min: 4 max: 60 x̄: 32.83 x̃: 29 HURT stats (rel) min: 0.38% max: 12.79% x̄: 5.86% x̃: 4.65% 95% mean confidence interval for cycles value: -3060.28 -2660.27 95% mean confidence interval for cycles %-change: -1.59% -1.37% Cycles are helped. total spills in shared programs: 23372 -> 21869 (-6.43%) spills in affected programs: 11730 -> 10227 (-12.81%) helped: 352 HURT: 0 total fills in shared programs: 34747 -> 35351 (1.74%) fills in affected programs: 11013 -> 11617 (5.48%) helped: 3 HURT: 347 Ivy Bridge and Sandybridge had similar results. (Ivy Bridge shown) total instructions in shared programs: 11956420 -> 11956126 (<.01%) instructions in affected programs: 14898 -> 14604 (-1.97%) helped: 98 HURT: 0 helped stats (abs) min: 3 max: 3 x̄: 3.00 x̃: 3 helped stats (rel) min: 1.30% max: 3.57% x̄: 2.08% x̃: 2.00% 95% mean confidence interval for instructions value: -3.00 -3.00 95% mean confidence interval for instructions %-change: -2.18% -1.98% Instructions are helped. total cycles in shared programs: 178791217 -> 178790792 (<.01%) cycles in affected programs: 149763 -> 149338 (-0.28%) helped: 91 HURT: 7 helped stats (abs) min: 3 max: 107 x̄: 20.63 x̃: 16 helped stats (rel) min: 0.13% max: 6.91% x̄: 1.40% x̃: 1.18% HURT stats (abs) min: 3 max: 322 x̄: 207.43 x̃: 322 HURT stats (rel) min: 0.14% max: 19.85% x̄: 12.73% x̃: 17.41% 95% mean confidence interval for cycles value: -18.94 10.27 95% mean confidence interval for cycles %-change: -1.28% 0.49% Inconclusive result (value mean confidence interval includes 0).	2019-09-19 14:22:22 -07:00
Ian Romanick	92f70df8c3	nir/algebraic: Do not apply late DPH optimization in vertex processing stages Some shaders do not use 'invariant' in vertex and (possibly) geometry shader stages on some outputs that are intended to be invariant. For various reasons, this optimization may not be fully applied in all shaders used for different rendering passes of the same geometry. This can result in Z-fighting artifacts (at best). For now, disable this optimization in these stages. In tessellation stages applications seem to use 'precise' when necessary, so allow the optimization in those stages. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111490 Fixes: `09705747d7` ("nir/algebraic: Reassociate fadd into fmul in DPH-like pattern") All Gen8+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16194726 -> 16344745 (0.93%) instructions in affected programs: 2855172 -> 3005191 (5.25%) helped: 6 HURT: 20279 helped stats (abs) min: 1 max: 3 x̄: 1.33 x̃: 1 helped stats (rel) min: 0.44% max: 1.00% x̄: 0.54% x̃: 0.44% HURT stats (abs) min: 1 max: 32 x̄: 7.40 x̃: 7 HURT stats (rel) min: 0.14% max: 42.86% x̄: 8.58% x̃: 6.56% 95% mean confidence interval for instructions value: 7.34 7.45 95% mean confidence interval for instructions %-change: 8.48% 8.67% Instructions are HURT. total cycles in shared programs: 364471296 -> 365014683 (0.15%) cycles in affected programs: 32421530 -> 32964917 (1.68%) helped: 2925 HURT: 16144 helped stats (abs) min: 1 max: 403 x̄: 18.39 x̃: 5 helped stats (rel) min: <.01% max: 22.61% x̄: 1.97% x̃: 1.15% HURT stats (abs) min: 1 max: 18471 x̄: 36.99 x̃: 15 HURT stats (rel) min: 0.02% max: 52.58% x̄: 5.60% x̃: 3.87% 95% mean confidence interval for cycles value: 21.58 35.41 95% mean confidence interval for cycles %-change: 4.36% 4.52% Cycles are HURT.	2019-09-19 14:21:31 -07:00
Jason Ekstrand	0c4e89ad5b	Move blob from compiler/ to util/ There's nothing whatsoever compiler-specific about it other than that's currently where it's used. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-19 19:56:22 +00:00
Samuel Iglesias Gonsálvez	5ed5e76741	nir/algebraic: refactor inexact opcode restrictions Refactor the code to avoid calling a lot of time to auxiliary functions when it is not really needed. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-09-19 18:57:27 +02:00
Caio Marcelo de Oliveira Filho	d38e0a6326	spirv: Add missing break for capability handling New added cases "stole" the previous break. Fixes: `420ad0a1a3` ("spirv: check support for SPV_KHR_float_controls capabilities") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-18 15:49:14 -07:00
Connor Abbott	57e0bb8ccc	nir/opt_if: Fix undef handling in opt_split_alu_of_phi() The pass assumed that "Most ALU ops produce an undefined result if any source is undef" which is completely untrue. Due to how we lower if statements to selects and then optimize on those selects later, we simply cannot make that assumption. In particular this pass tried to replace an ior of undef and true, which had been generated by optimizing a select which itself came from flattening an if statement, to undef causing a miscompilation for a CTS test with radeonsi NIR. We fix this by always doing what the non-undef path did, i.e. duplicate the instruction twice. If there are cases where the instruction before the loop can be folded away due to having an undef source, we should add these to opt_undef instead. The comment above the pass says that if the phi source from before the loop is undef, and we can fold the instruction before the loop to undef, then we can ignore sources of the original instruction that don't dominate the block before the loop because we don't need them to create the instruction before the loop. This is incorrect, because the instruction at the bottom of the loop would get those sources from the wrong loop iteration. The code never actually did what the comment said, so we only have to update the comment to match what the pass actually does. We also update the example to more closely match what most actual loops look like after vtn and peephole_select. There are no shader-db changes with i965, radeonsi NIR, or radv. With anv and my vkpipeline-db there's only one change: total instructions in shared programs: 14125290 -> 14125300 (<.01%) instructions in affected programs: 2598 -> 2608 (0.38%) helped: 0 HURT: 1 total cycles in shared programs: 2051473437 -> 2051473397 (<.01%) cycles in affected programs: 36697 -> 36657 (-0.11%) helped: 1 HURT: 0 Fixes KHR-GL45.shader_subroutine.control_flow_and_returned_subroutine_values_used_as_subroutine_input with radeonsi NIR.	2019-09-18 17:18:34 -04:00
Andres Gomez	d9760f8935	nir/opcodes: Clear variable names confusion Having Python and C variables sharing name in the same block of code makes its understanding a bit confusing. Make it explicit that the Python bit_size variable refers to the destination bit size. Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-18 23:59:07 +03:00
Samuel Iglesias Gonsálvez	dba4d0a319	nir: fix fmin/fmax support for doubles Until now, it was using the floating point version of fmin/fmax, instead of the double version. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	2abc6299bf	nir: fix denorm flush-to-zero in sqrt's lowering at nir_lower_double_ops v2: - Replace hard coded value with DBL_MIN (Connor). v3: - Have into account the FLOAT_CONTROLS_DENORM_PRESERVE_FP64 flag (Caio). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v2]	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	1e0e3ed15a	nir: fix denorms in unpack_half_1x16() According to VK_KHR_shader_float_controls: "Denormalized values obtained via unpacking an integer into a vector of values with smaller bit width and interpreting those values as floating-point numbers must: be flushed to zero, unless the entry point is declared with the code:DenormPreserve execution mode." v2: - Add nir_op_unpack_half_2x16_flush_to_zero opcode (Connor). v3: - Adapt to use the new NIR lowering framework (Andres). v4: - Updated to renamed shader info member and enum values (Andres). v5: - Simplify flags logic operations (Caio). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v2]	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	f097247dd8	nir/algebraic: disable inexact optimizations depending on float controls execution mode If FLOAT_CONTROLS_SIGNED_ZERO_INF_NAN_PRESERVE or FLOAT_CONTROLS_DENORM_FLUSH_TO_ZERO are enabled, do not apply the inexact optimizations so the VK_KHR_shader_float_controls execution mode is respected. v2: - Do not apply inexact optimizations if SHADER_DENORM_FLUSH_TO_ZERO is enabled (Andres). v3: - Updated to renamed shader info member (Andres). v4: - Directly access execution mode instead of dragging it by parameter (Caio). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v1]	2019-09-17 23:39:18 +03:00
Andres Gomez	3f782cdd25	nir/algebraic: mark float optimizations returning one parameter as inexact With the arrival of VK_KHR_shader_float_controls algebraic optimizations for float types of the form (('fop', a, b), a) become inexact depending on the execution mode. For example, if we have activated SHADER_DENORM_FLUSH_TO_ZERO, in case of a denorm value for the "a" parameter, we cannot return it still as a denorm, it needs to be flushed to zero. Therefore, we mark now all those operations as inexact. Suggested-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	5e22f3e29a	nir/constant_expressions: mind rounding mode converting from float to float16 destinations v2: - Move the op-code specific knowledge to nir_opcodes.py even if it means a rount trip conversion (Connor). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	ef681cf971	nir/opcodes: make sure f2f16_rtz and f2f16_rtne behavior is not overriden by the float controls execution mode Suggested-by: Connor Abbott <cwabbott0@gmail.com> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	7580707345	nir: mind rounding mode on fadd, fsub, fmul and fma opcodes According to Vulkan spec, the new execution modes affect only correctly rounded SPIR-V instructions, which includes fadd, fsub and fmul. v2: - Fix fmul, fsub and fadd round-to-zero definitions, they should use auxiliary functions to calculate the proper value because Mesa uses round-to-nearest-even rounding mode by default (Connor). v3: - Do an actual fused multiply-add at ffma (Connor). v4: - Simplify fadd and fmul for bit sizes < 64 (Connor). - Do not use double ffma for 32 bits float (Connor). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v3]	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	0ac07c7ca7	nir: add support for round to zero rounding mode to nir_op_f2f32 f2f16's rounding modes are already handled and f2f64 don't need it as there is not a floating point type with higher bit size than 64 for now. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	f7d73db353	nir: add support for flushing to zero denorm constants v2: - Refactor conditions and shared function (Connor). - Move code to nir_eval_const_opcode() (Connor). - Don't flush to zero on fquantize2f16 From Vulkan spec, VK_KHR_shader_float_controls section: "3) Do denorm and rounding mode controls apply to OpSpecConstantOp? RESOLVED: Yes, except when the opcode is OpQuantizeToF16." v3: - Fix bit size (Connor). - Fix execution mode on nir_loop_analize (Connor). v4: - Adapt after API changes to nir_eval_const_opcode (Andres). v5: - Simplify constant_denorm_flush_to_zero (Caio). v6: - Adapt after API changes and to use the new constant constructors (Andres). - Replace MAYBE_UNUSED with UNUSED as the first is going away (Andres). v7: - Adapt to newly added calls (Andres). - Simplified the auxiliary to flush denorms to zero (Caio). - Updated to renamed supported capabilities member (Andres). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v4] Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	45668a8be1	nir: add auxiliary functions to detect if a mode is enabled v2: - Added more functions. v3: - Simplify most of the functions (Caio). v4: - Updated to renamed enum values (Andres). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v2] Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> [v3]	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	84781e1f1d	spirv/nir: keep track of SPV_KHR_float_controls execution modes v2: - Add support for rounding modes for each floating point bit size. v3: - Commit `e68871f6a4` ("spirv: Handle constants and types before execution modes") changed when the execution modes are handled, which affects the result of the floating point constants when the rounding mode is set in the execution mode. Moved the handling of the rounding modes before we handle the constants. v4: - Rename vtn_decoration "literals" to "operands" (Andres). - Simplify execution mode parsing util function (Caio). - Extend the comment about the timing of the handling of the rounding modes (Caio). v5: - Correct extension name (Caio). - Rename shader info member (Andres). - Rename float controls enum (Andres). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v3] Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	420ad0a1a3	spirv: check support for SPV_KHR_float_controls capabilities v2: - Correct extension name (Caio). - Rename supported capabilities member (Andres). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v1] Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:18 +03:00
Caio Marcelo de Oliveira Filho	9cf1bfcdd7	spirv: Handle ShaderLayer and ShaderViewportIndex capabilities SPIR-V 1.5 incorported the SPV_EXT_shader_viewport_index_layer but splitting into the two capabilities above. Just handle them as we support the extension already. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-09-16 19:18:01 -07:00
Caio Marcelo de Oliveira Filho	f6392e38d8	spirv: Update JSON and headers to 1.5 Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-09-16 19:17:26 -07:00
Sergii Romantsov	2bfcf04345	nir/large_constants: pass after lowering copy_deref v2: by J.Ekstrand suggestion moved lowering of large constants after lowering of copy_deref is done. CC: Jason Ekstrand <jason@jlekstrand.net> CC: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111450 Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>	2019-09-16 11:23:48 +00:00
Sergii Romantsov	c7b2a2fd36	nir/large_constants: more careful data copying A filed of nir_variable.location may be equel to -1. That may cause copying to invalid address of list-node, making some internal fields corrupted. Patch fixes segfault during freeing context due to corrupted address of ralloc_header.destructor. v2: copy data if var is constant (Connor Abbott) CC: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Fixes: `b6d4753568` (nir/large_constants: De-duplicate constants) Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111676 Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-16 07:58:49 +00:00
Iago Toral Quiroga	544b156968	nir/lower_point_size: assume scalar PSIZ Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-12 06:40:04 +00:00
Caio Marcelo de Oliveira Filho	83fd1e58d8	glsl/nir: Add and use a gl_nir_link() function Perform all the NIR linking steps in order. Change iris and i965 to use it. Suggested by Alejandro. v2: Add gl_nir_linker_options struct. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v1]	2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho	664e4a610d	glsl/nir: Fill in the Parameters in NIR linker The parameter lists were not being created nor filled since i965 doesn't use them. In Gallium they are used for uniform handling, so add a way to fill them. The gl_uniform_storage struct got two new fields that let us go - from a Parameter to the matching UniformStorage and, - from the variable to the first UniformStorage without relying on names -- since they are optional for ARB_gl_spirv. Later patches will make use of them. v2: Do not fill parameters for i965. (Timothy) Use uint32_t for the new attributes. (Marek) v3: Serialize the new fields. (Timothy) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho	eda596d64b	compiler: Add glsl_contains_opaque() helper Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho	4f33f96c45	glsl/nir: Avoid overflow when setting max_uniform_location Don't use the UNMAPPED_UNIFORM_LOC (-1) to set the unsigned max_uniform_location. Those unmapped uniforms don't have to be accounted at this point. Fixes: `7a9e5cdfbb` ("nir/linker: Add gl_nir_link_uniforms()") Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-09-10 14:36:46 -07:00
Dylan Baker	26961e2cb5	glsl/tests: Handle windows \r\n new lines Currently the praser for s expressions assumes that newlines will be \n, resulting in incorrect parsing on windows, where the newline is \r\n. This patch just adds \r? to the regular expression used to parse the s expressions, which fixes at 1 test on windows. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-10 20:36:46 +00:00
Jason Ekstrand	c832820ce9	nir/dead_cf: Repair SSA if the pass makes progress The dead_cf pass calls into the CF manipulation helpers which attempt to keep NIR's SSA form sane. However, when the only break is removed from a loop, dominance gets messed up anyway because the CF SSA clean-up code only looks at phis and doesn't consider the case of code becoming unreachable. One solution to this would be to put the loop into LCSSA form before we modify any of its contents. Another (and the approach taken by this pass) is to just run the repair_ssa pass afterwards because the CF manipulation helpers are smart enough to keep all the use/def stuff sane; they just don't always preserve dominance properties. While we're here, we clean up some bogus indentation. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111405 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111069 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-06 23:39:01 +00:00
Jason Ekstrand	1005272a2b	nir/repair_ssa: Insert deref casts when needed Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-06 23:39:01 +00:00
Jason Ekstrand	a3268599f3	nir/repair_ssa: Repair dominance for unreachable blocks NIR currently assumes that unreachable blocks are trivially dominated by everything. However, when considering well-formed SSA, there is no path from any block to an unreachable block. Therefore, we can break any use-def chains where the use is in an unreachable block. This removes any dependencies on code created by uses in unreachable blocks and lets DCE do a better job of cleaning it up. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-06 23:39:01 +00:00
Jason Ekstrand	f81a2623d8	nir: Add a block_is_unreachable helper Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-06 23:39:01 +00:00
Jason Ekstrand	517142252f	nir: Don't infinitely recurse in lower_ssa_defs_to_regs_block Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-06 23:39:01 +00:00
Jason Ekstrand	37cdb7fc44	nir: Handle complex derefs in nir_split_array_vars We already bail and don't split the vars but we were passing a NULL to _mesa_hash_table_search which is not allowed. Fixes: `f1cb3348f1` "nir/split_vars: Properly bail in the presence of ..." Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-06 23:39:01 +00:00
Rhys Perry	6b8cb08756	nir/lower_io_to_vector: don't merge compact varyings Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: 02bc4aabb48 ('nir/lower_io_to_vector: allow FS outputs to be vectorized') Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-06 15:38:10 -07:00
Rhys Perry	bcd14756ee	nir/lower_io_to_vector: add flat mode This has lower_io_to_vector try to turn variables into arrays of 4-sized vectors when possible and fall back to the old approach when that isn't possible. This is so that lower_io_to_vector can guarantee that only one variable is used for each fragment shader output. v2: handle dual-source blending v3: don't try to merge structs and non-32-bit types in get_flat_type() v3: fix per-vertex inputs v3: fix and cleanup location advancement in get_flat_type() and it's calling code v4: prioritize the original mode over the flat mode v4: don't create flat variables to merge only one variable v5: don't skip an entire slot when encountering structs in the old mode Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-06 15:38:04 +00:00
Rhys Perry	300e758b7c	nir/lower_io_to_vector: allow FS outputs to be vectorized v2: handle dual-source blending v3: use a higher MAX_SLOTS Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-06 15:38:04 +00:00
Danylo Piliaiev	aabde02f2f	glsl: Fix unroll of do{} while(false) like loops For loops which condition is false on the first iteration iteration count was falsely calculated under the assumption that loop's condition is true until it becomes false, meaning it's true at least one time. Now such loops are reported as having 0 iteration. Similar to the fix `e71fc7f2` done in NIR. Fixes tests/shaders/glsl-fs-loop-while-false-02.shader_test Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-09-06 10:27:33 +00:00
Timur Kristóf	610cc3089c	nir: Carve out nir_lower_samplers from GLSL code. Lowering samplers is needed to produce NIR that can actually be consumed by some gallium drivers, so it doesn't make sense to to keep it only in the GLSL code. This commit introduces nir_lower_samplers to compiler/nir, while maintains the GL-specific function too. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-06 12:20:20 +03:00
Caio Marcelo de Oliveira Filho	c0c55bd84f	nir/lower_explicit_io: Handle 1 bit loads and stores Load a 32-bit value then convert to 1-bit. Convert 1-bit to 32-bit value, then Store it. These cases started to appear when we changed Anvil to use derefs for shared memory. v2: Use `bit_size` in a couple of places we were missing. (Jason) Reassign `value` instead of `src[0]`. (Jason) Fixes: `024a46a407` ("anv: use derefs for shared memory access") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-09-05 22:24:09 -07:00
Vasily Khoruzhick	9367d2ca37	nir: allow specifying filter callback in lower_alu_to_scalar Set of opcodes doesn't have enough flexibility in certain cases. E.g. Utgard PP has vector conditional select operation, but condition is always scalar. Lowering all the vector selects to scalar increases instruction number, so we need a way to filter only those ops that can't be handled in hardware. Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-06 01:51:28 +00:00
Connor Abbott	2af431cf7f	gallium: Plumb through a way to disable GLSL const lowering For radeonsi, we will prefer the NIR pass as it'll generate better code (some index calculation and a single load vs. a load, then index calculation, then another load) and oftentimes NIR optimization can kick in and make all the access indices constant. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-09-05 12:38:46 +02:00
Neil Roberts	95927c414f	glsl: Store the precision for a function return type The precision for a function return type is now stored in ir_function_signature. This will later be useful to implement mediump to float16 lowering. In the meantime it is also useful to catch errors where a function is redeclared with a different precision. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-09-04 12:41:20 +02:00
Eric Engestrom	7659c6197f	nir: fix memleak in error path Fixes: `2cf59861a8` ("nir: Add partial redundancy elimination for compares") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-09-04 00:31:53 +01:00
Rob Clark	5ccd5871ed	nir: remove unused constant_fold_state Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-03 14:10:57 -07:00
Connor Abbott	dcc64fcfed	nir: Fix num_ssbos when lowering atomic counters Otherwise it's impossible to know the maximum SSBO index for both internal TGSI shaders from TTN (which don't have any notion of atomic counters and no offset) as well as shaders from GLSL. I fixed everything I could find while grepping for num_ssbos and num_abos, which hopefully is everything (iris was the only user I could find that uses it in a meaningful way). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-03 15:54:54 +02:00
Samuel Pitoiset	966a455bb9	nir: do not assume that the result of fexp2(a) is always an integral It's only correct when 'a' is an integral greater or equal to 0. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111493 Fixes: `5544b2cbbd` ("nir/algebraic: Use value range analysis to eliminate useless unary ops") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-09-02 09:00:37 +02:00

1 2 3 4 5 ...

4121 commits