fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 04:58:08 +02:00

Author	SHA1	Message	Date
Timothy Arceri	edfcc4f022	nir: fix GCM when GVN enabled Enabling GVN uncovered a bug where we would crash if the pass was thinking about pushing something into a loop. Fixes: `6538b3e566` ("nir: add heuristic for instructions in loops with GCM") Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12242>	2021-08-17 03:15:49 +00:00
Rhys Perry	cfc4433015	nir,glsl_to_nir: use nir_fdot() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8056>	2021-08-16 17:19:45 +00:00
Rhys Perry	28acc4120f	nir: lower fdot to ffma if lower_ffma=false fossil-db (GFX10.3): Totals from 57689 (39.44% of 146267) affected shaders: VGPRs: 2873712 -> 2873432 (-0.01%); split: -0.01%, +0.00% CodeSize: 227661100 -> 227583572 (-0.03%); split: -0.08%, +0.04% MaxWaves: 1289562 -> 1289598 (+0.00%); split: +0.01%, -0.00% Instrs: 43115433 -> 43083308 (-0.07%); split: -0.12%, +0.05% Latency: 869947191 -> 870279826 (+0.04%); split: -0.06%, +0.10% InvThroughput: 199425811 -> 199434448 (+0.00%); split: -0.04%, +0.05% fossil-db (GFX10): Totals from 2 (0.00% of 146267) affected shaders: Latency: 8123 -> 8107 (-0.20%) fossil-db (GFX9): Totals from 2 (0.00% of 146401) affected shaders: (no stat changes) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8056>	2021-08-16 17:19:45 +00:00
Rhys Perry	174a4f36f9	nir: create ffma from builders more often We will not be able to combine instructions into ffma later if they are exact, so create them from the start. They can be lowered later if they are unwanted. fossil-db (GFX10.3): Totals from 16589 (11.34% of 146267) affected shaders: VGPRs: 938872 -> 938704 (-0.02%) SpillSGPRs: 11334 -> 10785 (-4.84%) CodeSize: 96551964 -> 96498040 (-0.06%); split: -0.08%, +0.02% MaxWaves: 338760 -> 338772 (+0.00%) Instrs: 18356857 -> 18350486 (-0.03%); split: -0.06%, +0.02% Latency: 561563310 -> 561414360 (-0.03%); split: -0.08%, +0.05% InvThroughput: 145629673 -> 145594740 (-0.02%); split: -0.04%, +0.01% fossil-db (GFX10): Totals from 16252 (11.11% of 146267) affected shaders: VGPRs: 893820 -> 893744 (-0.01%) SpillSGPRs: 11334 -> 10785 (-4.84%) CodeSize: 95890244 -> 95839124 (-0.05%); split: -0.08%, +0.02% MaxWaves: 367704 -> 367734 (+0.01%) Instrs: 18199741 -> 18194437 (-0.03%); split: -0.06%, +0.03% Latency: 560912971 -> 560854179 (-0.01%); split: -0.07%, +0.06% InvThroughput: 142899814 -> 142877939 (-0.02%); split: -0.03%, +0.02% fossil-db (GFX9): Totals from 16287 (11.12% of 146401) affected shaders: SGPRs: 1312784 -> 1312768 (-0.00%); split: -0.05%, +0.05% VGPRs: 931440 -> 931444 (+0.00%); split: -0.00%, +0.00% SpillSGPRs: 14623 -> 14597 (-0.18%) CodeSize: 94428788 -> 94344404 (-0.09%); split: -0.10%, +0.01% MaxWaves: 90105 -> 90109 (+0.00%) Instrs: 18486905 -> 18473434 (-0.07%); split: -0.08%, +0.01% Latency: 720947295 -> 720818323 (-0.02%); split: -0.07%, +0.05% InvThroughput: 365240104 -> 365224659 (-0.00%); split: -0.02%, +0.01% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8056>	2021-08-16 17:19:45 +00:00
Rhys Perry	ed70b256ce	nir: add ffma creation helpers Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8056>	2021-08-16 17:19:45 +00:00
Rhys Perry	4ec4d862c2	nir/algebraic: add is_used_once to dot product reassociation optimization This improves register usage. fossil-db (Sienna Cichlid, on top of !9805): Totals from 4317 (2.88% of 149839) affected shaders: VGPRs: 352592 -> 351704 (-0.25%); split: -1.48%, +1.23% SpillSGPRs: 182 -> 248 (+36.26%) CodeSize: 31601192 -> 31587624 (-0.04%); split: -0.09%, +0.04% MaxWaves: 56964 -> 57298 (+0.59%); split: +2.48%, -1.90% Instrs: 5973557 -> 5974122 (+0.01%); split: -0.05%, +0.06% Latency: 72088175 -> 72253033 (+0.23%); split: -0.36%, +0.59% InvThroughput: 14978160 -> 14798919 (-1.20%); split: -1.29%, +0.09% VClause: 100994 -> 98645 (-2.33%); split: -3.05%, +0.73% SClause: 278206 -> 276820 (-0.50%); split: -0.54%, +0.04% Copies: 200264 -> 199556 (-0.35%); split: -1.17%, +0.82% Branches: 86410 -> 85930 (-0.56%); split: -0.56%, +0.01% PreSGPRs: 207355 -> 207759 (+0.19%); split: -0.00%, +0.20% PreVGPRs: 314646 -> 310911 (-1.19%); split: -1.35%, +0.17% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8056>	2021-08-16 17:19:45 +00:00
Rhys Perry	f95a16be72	nir/algebraic: reassociate add chains for more MAD/FMA-friendly code fossil-db (GFX10.3): Totals from 25866 (17.68% of 146267) affected shaders: VGPRs: 1625456 -> 1644936 (+1.20%); split: -0.05%, +1.24% SpillSGPRs: 11729 -> 11725 (-0.03%); split: -0.07%, +0.03% CodeSize: 161604460 -> 161458052 (-0.09%); split: -0.11%, +0.02% MaxWaves: 454842 -> 452160 (-0.59%); split: +0.04%, -0.63% Instrs: 30652596 -> 30456446 (-0.64%); split: -0.65%, +0.01% Latency: 723098749 -> 722084247 (-0.14%); split: -0.21%, +0.07% InvThroughput: 166023468 -> 165506875 (-0.31%); split: -0.36%, +0.05% fossil-db (GFX10): Totals from 25866 (17.68% of 146267) affected shaders: VGPRs: 1593576 -> 1611976 (+1.15%); split: -0.09%, +1.25% SpillSGPRs: 11729 -> 11725 (-0.03%); split: -0.07%, +0.03% CodeSize: 162294468 -> 162154456 (-0.09%); split: -0.11%, +0.02% MaxWaves: 477448 -> 474166 (-0.69%); split: +0.10%, -0.79% Instrs: 30820164 -> 30625805 (-0.63%); split: -0.65%, +0.02% Latency: 723190249 -> 722273445 (-0.13%); split: -0.20%, +0.08% InvThroughput: 163114872 -> 162582966 (-0.33%); split: -0.37%, +0.04% fossil-db (GFX9): Totals from 25866 (17.67% of 146401) affected shaders: SGPRs: 2167808 -> 2169920 (+0.10%); split: -0.09%, +0.19% VGPRs: 1649404 -> 1667592 (+1.10%); split: -0.43%, +1.53% CodeSize: 161273556 -> 161281996 (+0.01%); split: -0.07%, +0.08% MaxWaves: 114910 -> 113519 (-1.21%); split: +0.10%, -1.31% Instrs: 31557180 -> 31403708 (-0.49%); split: -0.50%, +0.02% Latency: 899594793 -> 898786283 (-0.09%); split: -0.19%, +0.10% InvThroughput: 412265691 -> 411551698 (-0.17%); split: -0.28%, +0.11% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8056>	2021-08-16 17:19:45 +00:00
Rhys Perry	110bcb4919	nir/algebraic: add various ffma optimizations fossil-db (GFX10.3): Totals from 7532 (5.15% of 146267) affected shaders: VGPRs: 414696 -> 414304 (-0.09%); split: -0.18%, +0.08% CodeSize: 33393444 -> 33375908 (-0.05%); split: -0.13%, +0.08% MaxWaves: 149854 -> 150094 (+0.16%); split: +0.27%, -0.11% Instrs: 6279823 -> 6271364 (-0.13%); split: -0.18%, +0.05% Latency: 60308898 -> 60296025 (-0.02%); split: -0.13%, +0.11% InvThroughput: 13770542 -> 13745192 (-0.18%); split: -0.24%, +0.06% fossil-db (GFX10): Totals from 7532 (5.15% of 146267) affected shaders: VGPRs: 406664 -> 405564 (-0.27%); split: -0.39%, +0.12% CodeSize: 33544656 -> 33527568 (-0.05%); split: -0.13%, +0.08% MaxWaves: 158584 -> 158858 (+0.17%); split: +0.30%, -0.13% Instrs: 6316242 -> 6307913 (-0.13%); split: -0.18%, +0.05% Latency: 60243290 -> 60232844 (-0.02%); split: -0.13%, +0.11% InvThroughput: 13643345 -> 13620171 (-0.17%); split: -0.24%, +0.07% fossil-db (GFX9): Totals from 7543 (5.15% of 146401) affected shaders: SGPRs: 546384 -> 547472 (+0.20%); split: -0.08%, +0.28% VGPRs: 412636 -> 411896 (-0.18%); split: -0.27%, +0.09% CodeSize: 33216196 -> 33210564 (-0.02%); split: -0.12%, +0.11% MaxWaves: 38771 -> 38789 (+0.05%); split: +0.17%, -0.12% Instrs: 6419878 -> 6414891 (-0.08%); split: -0.18%, +0.11% Latency: 70972327 -> 70922754 (-0.07%); split: -0.15%, +0.08% InvThroughput: 33949039 -> 33909258 (-0.12%); split: -0.20%, +0.08% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8056>	2021-08-16 17:19:45 +00:00
Rhys Perry	82d0600ba2	nir: swap fadd operands in nir_atan() This shouldn't do anything but will make testing a later patch easier. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8056>	2021-08-16 17:19:44 +00:00
Eric Engestrom	4d9acfa533	python: drop explicit output_encoding='utf-8' in mako templates Python 3 handles unicode strings by default, so we can drop all that. Suggested-by: Dylan Baker <dylan@pnwbakers.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3674>	2021-08-14 21:44:32 +00:00
Eric Engestrom	93cb3aca03	Revert "python: Explicitly add the 'L' suffix on Python 3" This reverts commit `ad363913e6`. This code was added to be able to compare the output file while porting the script from python2 to python3, but this has long been finished and the extra complexity is not needed anymore. Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3674>	2021-08-14 21:44:32 +00:00
Eric Engestrom	f1eae2f8bb	python: drop python2 support Signed-off-by: Eric Engestrom <eric@engestrom.ch> Acked-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3674>	2021-08-14 21:44:32 +00:00
Caio Marcelo de Oliveira Filho	0092edfec0	nir/dead_cf: Do not remove loops with loads that can't be reordered If a loop is followed by a barrier, the ordering between a load inside the loop and other memory operations after the barrier may have to be preserved depending on the type of memory involved. This is relevant when the memory is writeable by other invocations. In such case, it is not valid to completely eliminate the loop. This commit doesn't attempt to precisely catch the barrier case, as analysis could become too complex. It simply assumes it can't drop the loops that contain certain types of loads unless those are known to be safe to reorder (via the access flag). Fixes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4475 Acked-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9938>	2021-08-14 01:48:03 +00:00
Bas Nieuwenhuizen	aa8179e33f	nir/inline_functions: Handle halting functions. Without this stitch_blocks complains about ending in a jump with a non-empty block after the inserted body. I hit this with CTS raytracing tests where we tried to inline a function that basically ended up being something like { ignore_ray_intersection halt } I kept the nop path when possible as that does not leave a mess for the optimization loop to optimize. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12163>	2021-08-13 21:18:13 +00:00
Bas Nieuwenhuizen	fa6cd6e00d	nir/lower_scratch: Ensure we don't lower vars with unsupported usage. Need to avoid lowering temps when they are used by other instructions, like the rt instructions (some of the shader call parameters get converted to temp variables and we will lower them later with the explicit io lowering pass as we need to guarantee they will end up in scratch). Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12162>	2021-08-13 20:56:30 +00:00
Rhys Perry	04bd2a1245	nir: remove src/compiler/nir/nir_control_flow Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12357>	2021-08-13 17:51:42 +01:00
Emma Anholt	673cc9323a	nir: Move phi src setup to a helper. Cleans up the ralloc/list push code all over the tree. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11772>	2021-08-13 16:11:57 +00:00
Vinson Lee	8d679f4f4e	nir: Initialize evaluate_cube_face_index_amd dst.x. Fix defect reported by Coverity Scan. Uninitialized scalar variable (UNINIT) uninit_use: Using uninitialized value dst.x. Fixes: `a1a2a8dfda` ("nir: add AMD_gcn_shader extended instructions") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12290>	2021-08-12 23:13:52 -07:00
Lionel Landwerlin	01b0935d31	nir/lower_shader_calls: remove empty phis This is confusing opt_cse. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `8dfb240b1f` ("nir: Add raytracing shader call lowering pass.") Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11953>	2021-08-11 15:10:07 +03:00
Marcin Ślusarz	e1b325f587	nir/builder: invalidate metadata per function Fixes: `a62098fff2` ("nir: Add a helper for general instruction-modifying passes.") Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12324>	2021-08-11 11:23:30 +00:00
Pierre-Eric Pelloux-Prayer	7684d57a05	nir: add a pass to optimize "gl_FragDepth = gl_FragCoord.z" away gl_FragDepth default value is gl_FragCoord.z so if a shader does: gl_FragDepth = gl_FragCoord.z we can drop this assignment. v2: use nir_ssa_scalar_resolved and don't do this if gl_FragDepth is written multiple times (Jason) v3: - move to its own pass (Jason) - handle var = NULL (Rhys) v4: refactoring (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10697>	2021-08-11 11:00:11 +02:00
Ian Romanick	84d2e53789	Revert "nir/algebraic: Convert some f2u to f2i" Per https://gitlab.freedesktop.org/mesa/mesa/-/issues/5178#note_1019666, the assumption fundamental to this optimization is false. Section 2.4.1 (Float to Integer) of Ivy Bridge PRMs describes the situation. The wording of the section is somewhat confusing (because it doesn't clearly delineate between signed and unsigned integers), but the last two rows of the table make it clear that F->UD conversion clamps negative float values to 0. All other hardware mentioned in that thread seems to behave the same way. The real problem is that, with hardware that behaves in this way, converting f2u(2147483648.0) to f2i(2147483648.0) changes the bit pattern that would be produced from 0x80000000 to 0x7fffffff. This reverts commit `ad05920258`. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12297>	2021-08-10 22:16:13 +00:00
Ian Romanick	3ba66ebbc8	nir/opcodes: Use u_intN_(min\|max) uadd_sat was updated using sed, so I didn't even notice the surrounding opcodes. Oops. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12297>	2021-08-10 22:16:13 +00:00
Alyssa Rosenzweig	9b57a81815	nir/lower_mediump: Fix metadata in all passes Fixes: `fb29cef8dd` ("nir: add many passes that lower and optimize 16-bit input/outputs and samplers") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11732>	2021-08-10 20:55:33 +00:00
Alyssa Rosenzweig	03c18f7efc	nir/lower_mediump_io: Don't remap base unless needed Otherwise drivers that don't use 16-bit slots for varyings will get confused and have their driver_locations scribbled over. This has caused multiple problems for both Panfrost and Asahi this week. Given the only other user of the pass for varyings is radeonsi, which needs both together, I think this is the least controversial fix. Fixes: `fb29cef8dd` ("nir: add many passes that lower and optimize 16-bit input/outputs and samplers") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11732>	2021-08-10 20:55:33 +00:00
Mike Blumenkrantz	ec66c58138	nir: add imm_vec3 to round these out Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12253>	2021-08-09 14:45:30 +00:00
Rhys Perry	d764de6460	nir/tests: add tests for umod/imod/irem optimizations Both nir_opt_algebraic and nir_opt_idiv_const have optimizations for umod/imod/irem by constants. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12039>	2021-08-09 11:00:39 +00:00
Rhys Perry	e008eb1224	nir: fix signed overflow for iadd constant folding Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12039>	2021-08-09 11:00:39 +00:00
Rhys Perry	b627b9fcec	nir/idiv_const: optimize imod/irem fossil-db changes (Sienna Cichlid): Totals from 223 (0.15% of 150170) affected shaders: CodeSize: 384564 -> 370824 (-3.57%) Instrs: 74518 -> 71961 (-3.43%) Latency: 351620 -> 344640 (-1.99%) InvThroughput: 80122 -> 74846 (-6.58%) VClause: 919 -> 920 (+0.11%) SClause: 2879 -> 2877 (-0.07%); split: -0.10%, +0.03% Copies: 3099 -> 3103 (+0.13%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12039>	2021-08-09 11:00:39 +00:00
Rhys Perry	96168301f9	nir/idiv_const: improve idiv(n, INT_MIN) This lowering is smaller and -INT64_MIN is probably UB (signed overflow). No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12039>	2021-08-09 11:00:39 +00:00
Rhys Perry	4e2b94331b	nir/algebraic: improve irem by power-of-two optimization Requires one less instruction. No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12039>	2021-08-09 11:00:39 +00:00
Rhys Perry	2bb49e4587	nir/search: don't consider INT_MIN a negative power-of-two ineg(INT_MIN)/iabs(INT_MIN) won't work as expected. No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12039>	2021-08-09 11:00:39 +00:00
Rhys Perry	b009467b81	nir/algebraic: add optimizations for imul(a, INT_MIN) is_pos_power_of_two would catch this, but nir_op_imul has signed sources, so is_neg_power_of_two catches it instead, which creates a useless nir_op_ineg. fossil-db (Sienna Cichlid): Totals from 1014 (0.68% of 150170) affected shaders: CodeSize: 3592296 -> 3592288 (-0.00%); split: -0.00%, +0.00% Instrs: 671211 -> 670426 (-0.12%) Latency: 5268917 -> 5268479 (-0.01%); split: -0.01%, +0.00% InvThroughput: 2187349 -> 2187343 (-0.00%); split: -0.00%, +0.00% VClause: 8634 -> 8636 (+0.02%) Copies: 97585 -> 97604 (+0.02%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12039>	2021-08-09 11:00:39 +00:00
Rhys Perry	65cd5a0f22	nir/algebraic: don't optimize umod/imod/irem if lower_bitops=true Match the udiv/idiv/imul by power-of-two optimizations. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12039>	2021-08-09 11:00:39 +00:00
Rhys Perry	ec4b425f59	nir/algebraic: fix imod by negative power-of-two If "a" is a multiple of "b", then the result would have been "b" instead of 0. No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Fixes: `0ef5f3552f` ("nir: add strength reduction pattern for imod/irem with pow2 divisor.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12039>	2021-08-09 11:00:39 +00:00
Dave Airlie	ad92c2b253	nir: add fisnormal lowering just lower the 32-bit version for now. Thanks to alyssa for this suggested lowering. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12207>	2021-08-06 14:27:48 +10:00
Dave Airlie	330e28155f	nir: add 32-bit bool of fisfinite Add the bool lowering as well. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12207>	2021-08-06 12:06:21 +10:00
Connor Abbott	8115cde3ba	tu, freedreno/a6xx, ir3: Rewrite tess PrimID handling The previous handling conflated RelPatchID and PrimID, which would result in incorrect gl_PrimitiveID when doing draw splitting and didn't work with PrimID passthrough which fills the VPC slot with the "correct" PrimID value from the tess factor BO which we left 0. Replace PrimID in the tess lowering pass with a new RelPatchID sysval, and replace PrimID with RelPatchID in the VS input code in turnip/freedreno at the same time so that there is no net change in the tess lowering code. However, now we have to add new mechanisms for getting the user-level PrimID: - In the TCS it comes from the VS, just like gl_PrimitiveIDIn in the GS. This means we have to add another register to our VS->TCS ABI. I decided to put PrimID in r0.z, after the TCS header and RelPatchID, because it might not be read in the TCS. - If any stage after the TCS uses PrimID, the TCS stores it in the first dword of the tess factor BO, and it is read by the fixed-function tessellator and accessed in the TES via the newly-uncovered DSPRIMID field. If we have tess and GS, the TES passes this value through to the GS in the same way as the VS does. PrimID passthrough for reading it in the FS when there's tess but no GS also "just works" once we start storing it in the TCS. In particular this fixes dEQP-VK.pipeline.misc.primitive_id_from_tess which tests exactly that. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12166>	2021-08-05 16:35:41 +00:00
Jason Ekstrand	0ddac113f8	nir: Removing uses of SSA defs destroys SSA liveness The liveness information will be a superset of real liveness so it's unlikely something will explode if it tries to use it. However, it is out-of-date and should be re-run if someone really wants it. Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12186>	2021-08-03 21:36:53 +00:00
Ian Romanick	72259a870f	util: Add and use functions to calculate min and max int for a size Many places need to know the maximum or minimum possible value for a given size integer... so everyone just open-codes their favorite version. There is some potential to hit either undefined or implementation-defined behavior, so having one version that Just Works seems beneficial. v2: Fix copy-and-pasted bug (INT64_MAX instead of INT64_MIN) in u_intmin. Noticed by CI. Lol. Rename functions `s/u_(uint\|int)(min\|max)/u_\1N_\2/g`. Suggested by Jason. Add some unit tests that would have caught the copy-and-paste bug before wasting CI time. Change the implementation of u_intN_min to use the same pattern as stdint.h. This avoids the integer division. Noticed by Jason. v3: Add changes to convert_clear_color (src/gallium/drivers/iris/iris_clear.c). Suggested by Nanley. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12177>	2021-08-03 12:55:02 -07:00
Timothy Arceri	6538b3e566	nir: add heuristic for instructions in loops with GCM Moving instructions out of large loops tends to cause excessive spilling. This appears to be a good limit. In future it might make sense to make this a NIR option so other drivers can set their own limits. Tiger Lake total instructions in shared programs: 20930180 -> 20926952 (-0.02%) instructions in affected programs: 280768 -> 277540 (-1.15%) helped: 734 HURT: 192 helped stats (abs) min: 1 max: 61 x̄: 5.16 x̃: 4 helped stats (rel) min: 0.04% max: 10.64% x̄: 3.23% x̃: 3.14% HURT stats (abs) min: 1 max: 52 x̄: 2.90 x̃: 1 HURT stats (rel) min: 0.03% max: 9.76% x̄: 1.13% x̃: 0.61% 95% mean confidence interval for instructions value: -3.89 -3.08 95% mean confidence interval for instructions %-change: -2.49% -2.16% Instructions are helped. total cycles in shared programs: 841825217 -> 838817552 (-0.36%) cycles in affected programs: 122088078 -> 119080413 (-2.46%) helped: 941 HURT: 100 helped stats (abs) min: 1 max: 160080 x̄: 3274.31 x̃: 2660 helped stats (rel) min: <.01% max: 41.64% x̄: 5.50% x̃: 4.80% HURT stats (abs) min: 1 max: 41856 x̄: 734.62 x̃: 26 HURT stats (rel) min: <.01% max: 7.29% x̄: 0.44% x̃: 0.27% 95% mean confidence interval for cycles value: -3236.56 -2541.85 95% mean confidence interval for cycles %-change: -5.26% -4.60% Cycles are helped. total sends in shared programs: 977905 -> 977782 (-0.01%) sends in affected programs: 2279 -> 2156 (-5.40%) helped: 119 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 1.03 x̃: 1 helped stats (rel) min: 0.60% max: 14.29% x̄: 6.93% x̃: 6.67% 95% mean confidence interval for sends value: -1.09 -0.98 95% mean confidence interval for sends %-change: -7.42% -6.45% Sends are helped. 
LOST: 2 GAINED: 0 Ice Lake total instructions in shared programs: 19865361 -> 19861747 (-0.02%) instructions in affected programs: 185789 -> 182175 (-1.95%) helped: 593 HURT: 47 helped stats (abs) min: 1 max: 27 x̄: 6.17 x̃: 4 helped stats (rel) min: 0.19% max: 8.65% x̄: 4.53% x̃: 4.60% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.03% max: 0.23% x̄: 0.11% x̃: 0.04% 95% mean confidence interval for instructions value: -5.93 -5.37 95% mean confidence interval for instructions %-change: -4.32% -4.06% Instructions are helped. total loops in shared programs: 6120 -> 6117 (-0.05%) loops in affected programs: 6 -> 3 (-50.00%) helped: 3 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% total cycles in shared programs: 961777176 -> 959404350 (-0.25%) cycles in affected programs: 172224180 -> 169851354 (-1.38%) helped: 936 HURT: 80 helped stats (abs) min: 1 max: 9566 x̄: 2621.08 x̃: 2550 helped stats (rel) min: <.01% max: 41.77% x̄: 4.22% x̃: 3.84% HURT stats (abs) min: 1 max: 59146 x̄: 1006.34 x̃: 24 HURT stats (rel) min: <.01% max: 3.78% x̄: 0.44% x̃: 0.25% 95% mean confidence interval for cycles value: -2513.72 -2157.20 95% mean confidence interval for cycles %-change: -4.13% -3.57% Cycles are helped. total sends in shared programs: 1019995 -> 1019872 (-0.01%) sends in affected programs: 2283 -> 2160 (-5.39%) helped: 119 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 1.03 x̃: 1 helped stats (rel) min: 0.60% max: 14.29% x̄: 6.91% x̃: 6.67% 95% mean confidence interval for sends value: -1.09 -0.98 95% mean confidence interval for sends %-change: -7.39% -6.42% Sends are helped. 
LOST: 4 GAINED: 0 Skylake total instructions in shared programs: 17994337 -> 17993846 (<.01%) instructions in affected programs: 146294 -> 145803 (-0.34%) helped: 190 HURT: 47 helped stats (abs) min: 1 max: 12 x̄: 2.83 x̃: 3 helped stats (rel) min: 0.14% max: 4.29% x̄: 1.08% x̃: 0.90% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.03% max: 0.22% x̄: 0.11% x̃: 0.04% 95% mean confidence interval for instructions value: -2.30 -1.84 95% mean confidence interval for instructions %-change: -0.95% -0.74% Instructions are helped. total loops in shared programs: 6029 -> 6023 (-0.10%) loops in affected programs: 12 -> 6 (-50.00%) helped: 6 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% 95% mean confidence interval for loops value: -1.00 -1.00 95% mean confidence interval for loops %-change: -50.00% -50.00% Loops are helped. total cycles in shared programs: 939062940 -> 938023548 (-0.11%) cycles in affected programs: 169671482 -> 168632090 (-0.61%) helped: 980 HURT: 134 helped stats (abs) min: 1 max: 25000 x̄: 1075.57 x̃: 1052 helped stats (rel) min: <.01% max: 42.75% x̄: 2.51% x̃: 1.32% HURT stats (abs) min: 1 max: 837 x̄: 109.45 x̃: 20 HURT stats (rel) min: <.01% max: 5.71% x̄: 0.73% x̃: 0.21% 95% mean confidence interval for cycles value: -1005.89 -860.17 95% mean confidence interval for cycles %-change: -2.39% -1.84% Cycles are helped. total sends in shared programs: 1026848 -> 1026724 (-0.01%) sends in affected programs: 2302 -> 2178 (-5.39%) helped: 120 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 1.03 x̃: 1 helped stats (rel) min: 0.60% max: 14.29% x̄: 6.91% x̃: 6.67% 95% mean confidence interval for sends value: -1.09 -0.98 95% mean confidence interval for sends %-change: -7.40% -6.43% Sends are helped. 
LOST: 1 GAINED: 1 Broadwell total instructions in shared programs: 17605621 -> 17605154 (<.01%) instructions in affected programs: 145691 -> 145224 (-0.32%) helped: 184 HURT: 48 helped stats (abs) min: 1 max: 12 x̄: 2.83 x̃: 3 helped stats (rel) min: 0.13% max: 4.29% x̄: 1.09% x̃: 0.93% HURT stats (abs) min: 1 max: 7 x̄: 1.12 x̃: 1 HURT stats (rel) min: 0.03% max: 0.48% x̄: 0.12% x̃: 0.04% 95% mean confidence interval for instructions value: -2.26 -1.77 95% mean confidence interval for instructions %-change: -0.95% -0.73% Instructions are helped. total loops in shared programs: 5968 -> 5963 (-0.08%) loops in affected programs: 10 -> 5 (-50.00%) helped: 5 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% 95% mean confidence interval for loops value: -1.00 -1.00 95% mean confidence interval for loops %-change: -50.00% -50.00% Loops are helped. total cycles in shared programs: 1000679489 -> 998592756 (-0.21%) cycles in affected programs: 173421234 -> 171334501 (-1.20%) helped: 993 HURT: 153 helped stats (abs) min: 1 max: 766608 x̄: 2118.49 x̃: 1080 helped stats (rel) min: <.01% max: 54.61% x̄: 2.61% x̃: 1.73% HURT stats (abs) min: 1 max: 2200 x̄: 110.61 x̃: 11 HURT stats (rel) min: <.01% max: 5.68% x̄: 0.63% x̃: 0.06% 95% mean confidence interval for cycles value: -3191.23 -450.54 95% mean confidence interval for cycles %-change: -2.47% -1.89% Cycles are helped. total sends in shared programs: 996341 -> 996222 (-0.01%) sends in affected programs: 2151 -> 2032 (-5.53%) helped: 115 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 1.03 x̃: 1 helped stats (rel) min: 0.60% max: 14.29% x̄: 7.07% x̃: 6.67% 95% mean confidence interval for sends value: -1.09 -0.98 95% mean confidence interval for sends %-change: -7.55% -6.58% Sends are helped. 
Haswell total instructions in shared programs: 16038375 -> 16038121 (<.01%) instructions in affected programs: 216797 -> 216543 (-0.12%) helped: 185 HURT: 217 helped stats (abs) min: 1 max: 12 x̄: 2.84 x̃: 3 helped stats (rel) min: 0.13% max: 4.23% x̄: 1.30% x̃: 1.20% HURT stats (abs) min: 1 max: 6 x̄: 1.25 x̃: 1 HURT stats (rel) min: 0.03% max: 5.66% x̄: 0.61% x̃: 0.40% 95% mean confidence interval for instructions value: -0.85 -0.41 95% mean confidence interval for instructions %-change: -0.40% -0.14% Instructions are helped. total loops in shared programs: 5947 -> 5942 (-0.08%) loops in affected programs: 10 -> 5 (-50.00%) helped: 5 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% 95% mean confidence interval for loops value: -1.00 -1.00 95% mean confidence interval for loops %-change: -50.00% -50.00% Loops are helped. total cycles in shared programs: 967655093 -> 965746713 (-0.20%) cycles in affected programs: 197288924 -> 195380544 (-0.97%) helped: 950 HURT: 195 helped stats (abs) min: 1 max: 782820 x̄: 2274.79 x̃: 1260 helped stats (rel) min: <.01% max: 54.26% x̄: 3.02% x̃: 1.71% HURT stats (abs) min: 1 max: 15790 x̄: 1295.73 x̃: 21 HURT stats (rel) min: <.01% max: 119.85% x̄: 7.76% x̃: 0.11% 95% mean confidence interval for cycles value: -3014.22 -319.19 95% mean confidence interval for cycles %-change: -1.83% -0.55% Cycles are helped. total sends in shared programs: 934894 -> 934765 (-0.01%) sends in affected programs: 2192 -> 2063 (-5.89%) helped: 115 HURT: 2 helped stats (abs) min: 1 max: 4 x̄: 1.14 x̃: 1 helped stats (rel) min: 0.60% max: 28.57% x̄: 7.68% x̃: 6.67% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 16.67% max: 16.67% x̄: 16.67% x̃: 16.67% 95% mean confidence interval for sends value: -1.23 -0.98 95% mean confidence interval for sends %-change: -8.28% -6.24% Sends are helped. 
LOST: 1 GAINED: 18 Ivy Bridge total instructions in shared programs: 15269357 -> 15269398 (<.01%) instructions in affected programs: 190484 -> 190525 (0.02%) helped: 77 HURT: 206 helped stats (abs) min: 1 max: 6 x̄: 2.47 x̃: 3 helped stats (rel) min: 0.14% max: 5.31% x̄: 1.46% x̃: 1.65% HURT stats (abs) min: 1 max: 3 x̄: 1.12 x̃: 1 HURT stats (rel) min: 0.03% max: 2.38% x̄: 0.42% x̃: 0.40% 95% mean confidence interval for instructions value: -0.06 0.35 95% mean confidence interval for instructions %-change: -0.21% 0.03% Inconclusive result (value mean confidence interval includes 0). total loops in shared programs: 4001 -> 3996 (-0.12%) loops in affected programs: 10 -> 5 (-50.00%) helped: 5 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% 95% mean confidence interval for loops value: -1.00 -1.00 95% mean confidence interval for loops %-change: -50.00% -50.00% Loops are helped. total cycles in shared programs: 562045564 -> 561063543 (-0.17%) cycles in affected programs: 200924872 -> 199942851 (-0.49%) helped: 748 HURT: 160 helped stats (abs) min: 2 max: 14926 x̄: 1692.94 x̃: 1620 helped stats (rel) min: <.01% max: 53.29% x̄: 3.17% x̃: 1.87% HURT stats (abs) min: 2 max: 15726 x̄: 1776.86 x̃: 36 HURT stats (rel) min: <.01% max: 114.43% x̄: 10.66% x̃: 0.21% 95% mean confidence interval for cycles value: -1237.33 -925.71 95% mean confidence interval for cycles %-change: -1.54% 0.08% Inconclusive result (%-change mean confidence interval includes 0). total sends in shared programs: 893348 -> 893330 (<.01%) sends in affected programs: 187 -> 169 (-9.63%) helped: 14 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.29 x̃: 1 helped stats (rel) min: 4.08% max: 22.22% x̄: 11.70% x̃: 10.10% 95% mean confidence interval for sends value: -1.56 -1.02 95% mean confidence interval for sends %-change: -14.92% -8.48% Sends are helped. 
LOST: 1 GAINED: 19 Sandy Bridge total instructions in shared programs: 11785227 -> 11785774 (<.01%) instructions in affected programs: 78403 -> 78950 (0.70%) helped: 65 HURT: 505 helped stats (abs) min: 1 max: 4 x̄: 2.22 x̃: 3 helped stats (rel) min: 0.14% max: 4.17% x̄: 1.19% x̃: 1.38% HURT stats (abs) min: 1 max: 5 x̄: 1.37 x̃: 1 HURT stats (rel) min: 0.24% max: 3.33% x̄: 1.57% x̃: 1.72% 95% mean confidence interval for instructions value: 0.85 1.07 95% mean confidence interval for instructions %-change: 1.16% 1.36% Instructions are HURT. total loops in shared programs: 2441 -> 2437 (-0.16%) loops in affected programs: 8 -> 4 (-50.00%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% 95% mean confidence interval for loops value: -1.00 -1.00 95% mean confidence interval for loops %-change: -50.00% -50.00% Loops are helped. total cycles in shared programs: 497178796 -> 496669298 (-0.10%) cycles in affected programs: 51483322 -> 50973824 (-0.99%) helped: 476 HURT: 137 helped stats (abs) min: 2 max: 7502 x̄: 1079.36 x̃: 1260 helped stats (rel) min: <.01% max: 42.50% x̄: 2.31% x̃: 0.86% HURT stats (abs) min: 2 max: 754 x̄: 31.23 x̃: 18 HURT stats (rel) min: <.01% max: 3.01% x̄: 0.09% x̃: 0.02% 95% mean confidence interval for cycles value: -901.99 -760.32 95% mean confidence interval for cycles %-change: -2.20% -1.36% Cycles are helped. total sends in shared programs: 642919 -> 642915 (<.01%) sends in affected programs: 32 -> 28 (-12.50%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 11.11% max: 14.29% x̄: 12.70% x̃: 12.70% 95% mean confidence interval for sends value: -1.00 -1.00 95% mean confidence interval for sends %-change: -15.61% -9.78% Sends are helped. 
Iron Lake total instructions in shared programs: 8180061 -> 8180248 (<.01%) instructions in affected programs: 65004 -> 65191 (0.29%) helped: 59 HURT: 253 helped stats (abs) min: 1 max: 4 x̄: 2.24 x̃: 3 helped stats (rel) min: 0.16% max: 2.23% x̄: 1.04% x̃: 1.29% HURT stats (abs) min: 1 max: 5 x̄: 1.26 x̃: 1 HURT stats (rel) min: 0.21% max: 3.85% x̄: 0.93% x̃: 0.60% 95% mean confidence interval for instructions value: 0.43 0.77 95% mean confidence interval for instructions %-change: 0.45% 0.68% Instructions are HURT. total loops in shared programs: 863 -> 861 (-0.23%) loops in affected programs: 4 -> 2 (-50.00%) helped: 2 HURT: 0 total cycles in shared programs: 239357490 -> 238907668 (-0.19%) cycles in affected programs: 17314006 -> 16864184 (-2.60%) helped: 176 HURT: 34 helped stats (abs) min: 4 max: 13400 x̄: 2558.05 x̃: 2920 helped stats (rel) min: 0.01% max: 35.58% x̄: 3.76% x̃: 2.69% HURT stats (abs) min: 2 max: 14 x̄: 11.59 x̃: 14 HURT stats (rel) min: <.01% max: 0.06% x̄: 0.03% x̃: 0.03% 95% mean confidence interval for cycles value: -2440.68 -1843.34 95% mean confidence interval for cycles %-change: -3.78% -2.51% Cycles are helped. 
GM45 total instructions in shared programs: 4985293 -> 4985401 (<.01%) instructions in affected programs: 58807 -> 58915 (0.18%) helped: 57 HURT: 202 helped stats (abs) min: 1 max: 4 x̄: 2.26 x̃: 3 helped stats (rel) min: 0.15% max: 2.23% x̄: 1.06% x̃: 1.29% HURT stats (abs) min: 1 max: 5 x̄: 1.17 x̃: 1 HURT stats (rel) min: 0.21% max: 3.85% x̄: 0.76% x̃: 0.48% 95% mean confidence interval for instructions value: 0.22 0.61 95% mean confidence interval for instructions %-change: 0.24% 0.48% Instructions are HURT. total loops in shared programs: 639 -> 638 (-0.16%) loops in affected programs: 2 -> 1 (-50.00%) helped: 1 HURT: 0 total cycles in shared programs: 153794236 -> 153546274 (-0.16%) cycles in affected programs: 9947778 -> 9699816 (-2.49%) helped: 110 HURT: 31 helped stats (abs) min: 4 max: 13400 x̄: 2257.51 x̃: 1796 helped stats (rel) min: 0.01% max: 35.58% x̄: 4.33% x̃: 2.45% HURT stats (abs) min: 2 max: 14 x̄: 11.74 x̃: 14 HURT stats (rel) min: <.01% max: 0.06% x̄: 0.03% x̃: 0.03% 95% mean confidence interval for cycles value: -2113.77 -1403.42 95% mean confidence interval for cycles %-change: -4.27% -2.47% Cycles are helped. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2899 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12064>	2021-08-03 10:54:50 +00:00
Timothy Arceri	a7f2e683de	nir: move nir_block_ends_in_break() to nir.h Will be used in a following commit. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12064>	2021-08-03 10:54:50 +00:00
Timothy Arceri	a9ed4538ab	nir: add indirect loop unrolling to compiler options This is where it should be rather than having to pass it into the optimisation pass every time. It also allows us to call the loop analysis pass without having to duplicate these options which we will do later in this series. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12064>	2021-08-03 10:54:50 +00:00
Timur Kristóf	da9f4b2e67	nir, aco: Remove vertex and primitive count overwrite intrinsic. It's no longer needed. No Fossil DB changes. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11908>	2021-08-02 11:38:25 +00:00
Timur Kristóf	1bbea90f50	aco, nir, ac: Simplify sequence of getting initial NGG VS edge flags. Instead of v_bfe + v_lshl_or for each vertex, get all 3 edge flags at once of every vertex. This takes fewer VALU instructions than previously. Fossil DB results on Sienna Cichlid (with NGGC on): Totals from 56917 (44.24% of 128647) affected shaders: CodeSize: 161028288 -> 158751628 (-1.41%) Instrs: 30917985 -> 30519571 (-1.29%) Latency: 130617204 -> 129975532 (-0.49%); split: -0.50%, +0.01% InvThroughput: 21280238 -> 20927401 (-1.66%) Copies: 3011120 -> 3011125 (+0.00%); split: -0.00%, +0.00% No Fossil DB changed with NGGC off. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11908>	2021-08-02 11:38:25 +00:00
Emma Anholt	9ffd00bcf1	nir_to_tgsi: Pack our tex coords into vec4 nir_tex_src_backend[12]. For TGSI, we need the coordinate, comparator, bias, and LOD all together in the first two vec4 args, and by doing it in the backend we were generating extra MOVs. softpipe shader-db results: total instructions in shared programs: 2985416 -> 2953625 (-1.06%) instructions in affected programs: 499937 -> 468146 (-6.36%) total temps in shared programs: 544769 -> 565869 (3.87%) temps in affected programs: 105469 -> 126569 (20.01%) i915g shader-db: total instructions in shared programs: 371625 -> 369594 (-0.55%) instructions in affected programs: 24903 -> 22872 (-8.16%) total tex_indirect in shared programs: 11381 -> 11365 (-0.14%) tex_indirect in affected programs: 43 -> 27 (-37.21%) LOST: 7 GAINED: 16 The temps increase is the pre-existing issue that we never release temps for NIR regs, which doesn't matter much for softpipe (just memory/cache footprint) but does for i915g as seen by shaders that no longer compile (though overall we seem to win). Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11912>	2021-07-29 09:05:05 -07:00
Enrico Galli	16ef26ffcb	nir_lower_readonly_images_to_tex: Fix typeo on image arrays Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12119>	2021-07-29 01:44:45 +00:00
Lionel Landwerlin	7e3bad0f8e	nir/lower_shader_calls: adding missing stack offset alignment Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `8dfb240b1f` ("nir: Add raytracing shader call lowering pass.") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12112>	2021-07-28 23:04:21 +00:00
Daniel Schürmann	bc500da67d	nir/shrink_vectors: shrink vecN properly This patch allows to shrink vecN instructions where one or more components at any position are unused. Stat changes for softpipe: total instructions in shared programs: 2986101 -> 2985416 (-0.02%) instructions in affected programs: 51216 -> 50531 (-1.34%) Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11411>	2021-07-26 09:24:37 +00:00
Daniel Schürmann	36fe7398c0	nir/shrink_vectors: shrink ALU properly ALU instructions of which not all components are read, can be shrunk to the number of read components. Previously, this would only remove trailing components. This patch enables to remove components from any position. Stat changes for softpipe: total instructions in shared programs: 3001291 -> 2984698 (-0.55%) instructions in affected programs: 225585 -> 208992 (-7.36%) total loops in shared programs: 1389 -> 1358 (-2.23%) loops in affected programs: 36 -> 5 (-86.11%) Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11411>	2021-07-26 09:24:37 +00:00

3670 commits