fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-17 18:18:06 +02:00

Author	SHA1	Message	Date
Daniel Schürmann	5f79e4e69a	nir/algebraic: fold some nested bcsel Totals from 14266 (10.62% of 134368) affected shaders (Polaris): SGPRs: 761756 -> 762732 (+0.13%); split: -0.00%, +0.13% VGPRs: 430392 -> 430924 (+0.12%); split: -0.05%, +0.17% SpillSGPRs: 4652 -> 4628 (-0.52%); split: -0.60%, +0.09% CodeSize: 30133000 -> 29949780 (-0.61%); split: -0.66%, +0.05% MaxWaves: 102122 -> 102111 (-0.01%); split: +0.00%, -0.01% Instrs: 5845085 -> 5841668 (-0.06%); split: -0.08%, +0.03% Cycles: 69033140 -> 68889188 (-0.21%); split: -0.22%, +0.01% VMEM: 8479021 -> 8474978 (-0.05%); split: +0.03%, -0.08% SMEM: 831437 -> 830464 (-0.12%); split: +0.06%, -0.18% VClause: 105411 -> 105410 (-0.00%); split: -0.01%, +0.01% SClause: 327727 -> 327780 (+0.02%); split: -0.00%, +0.02% Copies: 372704 -> 373306 (+0.16%); split: -0.16%, +0.32% Branches: 112260 -> 112269 (+0.01%); split: -0.00%, +0.01% PreSGPRs: 433308 -> 433631 (+0.07%); split: -0.01%, +0.09% PreVGPRs: 397888 -> 397905 (+0.00%); split: -0.01%, +0.01% Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>	2020-07-20 15:56:46 +00:00
Daniel Schürmann	27244662f2	nir/algebraic: propagate b2i out of ior/iand Totals from 761 (0.57% of 134368) affected shaders (Polaris): SGPRs: 29496 -> 29488 (-0.03%) SpillSGPRs: 41 -> 43 (+4.88%) CodeSize: 1922036 -> `1882408` (-2.06%); split: -2.08%, +0.02% Instrs: 366051 -> 360362 (-1.55%); split: -1.57%, +0.02% Cycles: 7692516 -> 7661216 (-0.41%); split: -0.41%, +0.01% VMEM: 365175 -> 365172 (-0.00%) VClause: 15324 -> 15322 (-0.01%) SClause: 9825 -> 9824 (-0.01%); split: -0.02%, +0.01% Copies: 41216 -> 41294 (+0.19%); split: -0.01%, +0.20% Branches: 7020 -> 7033 (+0.19%) PreSGPRs: 22103 -> 22106 (+0.01%) PreVGPRs: 26518 -> 26515 (-0.01%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>	2020-07-20 15:56:46 +00:00
Daniel Schürmann	baee5a9812	nir/algebraic: add distributive rules for ior/iand Totals from 581 (0.43% of 134368) affected shaders (Polaris): CodeSize: 1389560 -> 1386488 (-0.22%) Instrs: 264488 -> 263984 (-0.19%) Cycles: 1057952 -> 1055936 (-0.19%) VMEM: 296016 -> 291613 (-1.49%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>	2020-07-20 15:56:46 +00:00
Daniel Schürmann	70d3efeb88	nir/algebraic: optimize (a < 0.0) ? -a : a -> fabs(a) Totals from affected shaders: (VEGA) SGPRS: 13920 -> 13920 (0.00 %) VGPRS: 10252 -> 10252 (0.00 %) Spilled SGPRs: 62 -> 62 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 587648 -> 587224 (-0.07 %) bytes LDS: 5 -> 5 (0.00 %) blocks Max Waves: 1489 -> 1489 (0.00 %) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>	2020-07-20 15:56:46 +00:00
Daniel Schürmann	9d22c5ed71	nir/algebraic: optimize fmul(x, bcsel(c, -1.0, 1.0)) -> bcsel(c, -x, x) Totals from affected shaders: (VEGA) SGPRS: 545712 -> 545712 (0.00 %) VGPRS: 413092 -> 413116 (0.01 %) Spilled SGPRs: 10616 -> 10616 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 37031684 -> 36984248 (-0.13 %) bytes LDS: 427 -> 427 (0.00 %) blocks Max Waves: 54350 -> 54340 (-0.02 %) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>	2020-07-20 15:56:46 +00:00
Daniel Schürmann	56ec814b56	nir/algebraic: add some more unop + bcsel optimizations Totals from affected shaders: (VEGA) SGPRS: 284392 -> 284400 (0.00 %) VGPRS: 261080 -> 261076 (-0.00 %) Spilled SGPRs: 105 -> 105 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 24698596 -> 24277788 (-1.70 %) bytes LDS: 196 -> 196 (0.00 %) blocks Max Waves: 10101 -> 10105 (0.04 %) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>	2020-07-20 15:56:45 +00:00
Daniel Schürmann	2fca183910	nir/algebraic: add optimizations for fsign/isign This just reverts fsign/isign lowering. Totals from affected shaders: SGPRS: 257496 -> 256672 (-0.32 %) VGPRS: 181800 -> 178864 (-1.61 %) Spilled SGPRs: 105 -> 105 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 11355852 -> 11141840 (-1.88 %) bytes LDS: 3789 -> 3789 (0.00 %) blocks Max Waves: 30453 -> 30951 (1.64 %) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>	2020-07-20 15:56:45 +00:00
Daniel Schürmann	8e1b75b330	nir/algebraic: optimize iand/ior of (n)eq zero Found in some Detroit: Become Human shaders. Totals from affected shaders: SGPRS: 700256 -> 700256 (0.00 %) VGPRS: 507208 -> 507212 (0.00 %) Spilled SGPRs: 142531 -> 142531 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 76404616 -> 76301768 (-0.13 %) bytes LDS: 43 -> 43 (0.00 %) blocks Max Waves: 21438 -> 21438 (0.00 %) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>	2020-07-20 15:56:45 +00:00
Daniel Schürmann	e4281dbecc	nir: also move b2i in case of nir_move_copies Booleans are often more efficient with register usage. This also allows to move comparisons further. Totals from affected shaders: (VEGA) SGPRS: 451608 -> 450320 (-0.29 %) VGPRS: 351448 -> 351256 (-0.05 %) Spilled SGPRs: 105 -> 105 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 1008 -> 1008 (0.00 %) dwords per thread Code Size: 26555596 -> 26551080 (-0.02 %) bytes LDS: 10323 -> 10323 (0.00 %) blocks Max Waves: 42850 -> 42934 (0.20 %) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>	2020-07-20 15:56:45 +00:00
Daniel Schürmann	de0ebaf09d	nir/algebraic: optimize bcsel(a, 0, 1) to b2i This avoids combination with other bcsel operations, and as b2i is often a no-op (when used for iadd and such), the resulting pattern is preferable. Totals from affected shaders: (VEGA) SGPRS: 598448 -> 598448 (0.00 %) VGPRS: 457940 -> 457352 (-0.13 %) Spilled SGPRs: 127154 -> 127154 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 64836352 -> 64802728 (-0.05 %) bytes LDS: 781 -> 781 (0.00 %) blocks Max Waves: 22931 -> 22931 (0.00 %) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>	2020-07-20 15:56:45 +00:00
Icecream95	2aa507f87a	nir: Set the alignment for SSBO lowering The alignment can just be copied from the source intrinsic. Fixes the assertion nir_intrinsic_align_offset(instr) < nir_intrinsic_align_mul(instr) Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5949>	2020-07-17 18:03:50 +00:00
Icecream95	4e986568b8	nir: Fix lower_two_sided_color when the face is an input Fixes the two-sided-lighting and vertex-program-two-side piglit tests on Panfrost. v2: Use an existing variable for gl_FrontFacing if present. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Tested-by: Urja Rannikko <urjaman@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5915>	2020-07-17 14:50:26 +00:00
Icecream95	314ba5e174	nir: Add a face_sysval argument to nir_lower_two_sided_color This is needed for handling drivers that use an input for loading the face, for example Panfrost with Midgard GPUs. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Tested-by: Urja Rannikko <urjaman@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5915>	2020-07-17 14:50:26 +00:00
Icecream95	c20d166b4e	pan/mdg: Do per-sample framebuffer loads EXT_shader_framebuffer_fetch requires the fetched value to be per-sample, so we need to load the sample id when in a fragment shader. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5930>	2020-07-17 14:34:47 +00:00
Jesse Natalie	0e90b3d0c4	nir: Support load/store of temps as scratch in nir_lower_explicit_io Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5889>	2020-07-14 18:15:40 +00:00
Jesse Natalie	99aaf0ec18	nir: When nir_lower_vars_to_explicit_types is run on temps, update scratch_size To allow interop with other scratch ops, append any remaining temp vars to the end of any already-allocated scratch space. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5889>	2020-07-14 18:15:40 +00:00
Jesse Natalie	bf138c1fd4	nir_lower_io: Add addr_format_is_offset helper Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5889>	2020-07-14 18:15:40 +00:00
Rhys Perry	7ba645d5cb	nir/lower_subgroups: add lower_shuffle_to_swizzle_amd masked_swizzle_amd can be much faster than shuffle. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5695>	2020-07-13 14:11:50 +00:00
Rhys Perry	9c317cb278	nir/lower_subgroups: pass options struct to lower_shuffle Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5695>	2020-07-13 14:11:50 +00:00
Icecream95	2e3a589e6c	nir: Add a base value to load_raw_output_pan This is the render target the read instruction uses. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5755>	2020-07-13 13:35:10 +00:00
Jason Ekstrand	351b5137d7	spirv: Allow block-decorated struct types for constants Whenever a struct type is decorated Block or BufferBlock we turn that into a GLSL_TYPE_INTERFACE. Since these decorations can end up random places, we should allow them for constants. Closes: #3252 Fixes: `9d0ae777dd` "spirv: Use interface type for block and buffer..." Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5855>	2020-07-12 00:02:45 +00:00
Mike Blumenkrantz	b8df1c43d2	nir: allow nir_lower_clip_halfz to run in geometry shaders the final output of gl_Position needs this transform, and geometry shaders must write this value for stream 0 if rasterization is enabled Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5851>	2020-07-11 07:32:25 +00:00
Mike Blumenkrantz	3fe87a5836	nir: allow nir_lower_point_size_mov to run in geometry shader geometry shaders may need to emit PSIZ as well Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5851>	2020-07-11 07:32:24 +00:00
Jason Ekstrand	29cba3b695	nir/validate: Don't abort() until after the shader has printed In the case where SSA use/def chains are broken, NIR prints out a very cryptic error and then aborts. This abort happens during validation rather than after the print is complete, hiding any other errors that may have been found. One might think, "So what? Fix your use/def issue first." However, what makes this especially bad is that, when use/def chains are broken, there's usually a much nicer error inline in the shader that would have been printed had we not aborted early so the current behavior simply ensures you get the most cryptic error possible in an already difficult-to-debug case. While we're at it, we remove the one other case of abort() which is in the validation of phi instruction sources. Reviewed-by: Rob Clark <robclark@freedesktop.org> Tested-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5809>	2020-07-08 19:51:58 +00:00
Mike Blumenkrantz	1fd3563025	nir: add lowering pass for fragcolor -> fragdata this is needed for zink and other drivers which can support fragcolor but not fragdata and want to correctly handle EXT_multiview_draw_buffers Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5687>	2020-07-08 14:51:34 +00:00
Daniel Schürmann	9300a14ffb	nir: refactor nir_can_move_instr Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5622>	2020-07-07 19:24:28 +02:00
Daniel Schürmann	09d0e06c5c	nir: also move vecN in case of nir_move_copies Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5622>	2020-07-07 19:24:28 +02:00
Neil Roberts	121b82f638	nir: Add intrinsics for the line width The first intrinsic is intended to expose the value set by glLineWidth to shaders internally. The second intrinsic exposes the value actually sent to the hardware. This may be wider than the first one in order to implement anti-aliasing. These will be used in later patches to implement a line smoothing lowering pass. v2: Add a second intrinsic for the expanded line width for anti-aliasing. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5624>	2020-07-06 21:59:16 +00:00
Neil Roberts	14dd65bb5b	compiler: Add a system value for the line coord The line coord is a coordinate along the axis perpendicular to the line. It is in the range [0,1] between the two edges of the line. It is available at least on Broadcom hardware. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5624>	2020-07-06 21:59:15 +00:00
Jason Ekstrand	a6ed1d7fa5	nir: Add docs to nir_lower[_explicit]_io Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5418>	2020-07-06 19:54:30 +00:00
Jason Ekstrand	0bc5a829dd	nir: Remove shared support from lower_io No drivers are using this anymore so we can delete it and not keep maintaining this legacy code-path. If any drivers want this in the future, they should use nir_lower_varst_to_explicit_types followed by nir_lower_explicit_io. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5418>	2020-07-06 19:54:30 +00:00
Jason Ekstrand	be96b069ad	nir: Assert that nir_lower_io is only called with allowed modes Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5418>	2020-07-06 19:54:30 +00:00
Connor Abbott	12e18d9e7a	nir: add vec2_index_32bit_offset address format For turnip, we use the "bindless" model on a6xx. Loads and stores with the bindless model require a bindless base, which is an immediate field in the instruction that selects between 5 different 64-bit "bindless base registers", a 32-bit descriptor index that's added to the base, and the usual 32-bit offset. The bindless base usually, but not always, corresponds to the Vulkan descriptor set. We can handle the case where the base is non-constant by using a bunch of if-statements, to make it a little easier in core NIR, and this seems to be what Qualcomm's driver does too. Therefore, the pointer format we need to use in NIR has a vec2 index, for the bindless base and descriptor index. Plumb this format through core NIR. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5683>	2020-07-06 16:44:15 +00:00
Connor Abbott	7ab7316003	nir: Refactor load/store intrinsic helper Add the possibility to specify the source components. This is necessary to let the UBO/SSBO index have more than one component, and it also lets us remove a few hand-rolled load intrinsic definitions. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5683>	2020-07-06 16:44:15 +00:00
Mike Blumenkrantz	fb2fe802f6	nir: add lowering pass for clip plane enabling a pass which rewrites gl_ClipDistance[n] to an undef if the corresponding clip plane is disabled in the rasterizer state this pass is needed for zink to handle api disables of clip planes Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5529>	2020-07-03 08:56:30 +00:00
Timothy Arceri	dbf016e259	nir: fix implicit fallthrough warnings Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5705>	2020-07-02 23:52:52 +00:00
Ian Romanick	8591adea38	nir/algebraic: Don't distrubte absolute-value into dot-products Dot product is multiplication followed by addition, and absolute value does not distribute into addition. Only vec4 platforms are affected by this change as scalar-only platforms never have any of the fdot_replicated instructions. In the shader-db results, below, shaders in MANY different applications are affected. Trine, Doom3, Enemy Territory: Quake Wars, Counter Strike: Global Offensive, Mad Max, Metro Last Light, and on and on... I'm really shocked that there were no test regressions! All Haswell and earlier platforms had similar results. (Haswell shown) total instructions in shared programs: 16219743 -> 16219820 (<.01%) instructions in affected programs: 12171 -> 12248 (0.63%) helped: 1 HURT: 78 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.78% max: 0.78% x̄: 0.78% x̃: 0.78% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.35% max: 2.38% x̄: 0.91% x̃: 1.06% 95% mean confidence interval for instructions value: 0.92 1.03 95% mean confidence interval for instructions %-change: 0.78% 1.00% Instructions are HURT. total cycles in shared programs: 538481383 -> 538491045 (<.01%) cycles in affected programs: 470796 -> 480458 (2.05%) helped: 149 HURT: 142 helped stats (abs) min: 1 max: 1338 x̄: 71.13 x̃: 4 helped stats (rel) min: 0.06% max: 40.99% x̄: 2.76% x̃: 0.67% HURT stats (abs) min: 1 max: 2092 x̄: 142.68 x̃: 12 HURT stats (rel) min: 0.07% max: 55.38% x̄: 5.07% x̃: 1.07% 95% mean confidence interval for cycles value: -5.28 71.69 95% mean confidence interval for cycles %-change: -0.07% 2.19% Inconclusive result (value mean confidence interval includes 0). Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Fixes: `62795475e8` ("nir/algebraic: Distribute source modifiers into instructions") Closes: #3129 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5581>	2020-07-02 14:05:33 -07:00
Timothy Arceri	d55aa78615	nir: add missing break to nir_opt_access() Fixes: `f2d0e48ddc` ("glsl/nir: Add optimization pass for access flags") Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5714>	2020-07-02 12:11:30 +10:00
Alyssa Rosenzweig	54d7907c27	nir: Propagate 216 conversions into vectors If we have code like: ('f2f16', ('vec2', ('f2f32', 'a@16'), '#b@32')) We would like to eliminate the conversions, but the existing rules can't see into the the (heterogenous) vector. So instead of trying to eliminate in one pass, we add opts to propagate the f2f16 into the vector. Even if nothing further happens, this is often a win since then the created vector is smaller (half2 instead of float2). Hence the above gets transformed to ('vec2', ('f2f16', ('f2f32', 'a@16')), ('f2f16', '#b@32')) Then the existing f2f16(f2f32) rule will kick in for the first component and constant folding will for the second and we'll be left with ('vec2', 'a@16', '#b@16') ...eliminating all conversions. v2: Predicate on !options->vectorize_vec2_16bit. As discussed, this optimization helps greatly on true vector architectures (like Midgard) but wreaks havoc on more modern SIMD-within-a-register architectures (like Bifrost and modern AMD). So let's predicate on that. v3: Extend for integers as well and add a comment explaining the transforms. Results on Midgard (unfortunately a true SIMD architecture): total instructions in shared programs: 51359 -> 50963 (-0.77%) instructions in affected programs: 4523 -> 4127 (-8.76%) helped: 53 HURT: 0 helped stats (abs) min: 1 max: 86 x̄: 7.47 x̃: 6 helped stats (rel) min: 1.71% max: 28.00% x̄: 9.66% x̃: 7.34% 95% mean confidence interval for instructions value: -10.58 -4.36 95% mean confidence interval for instructions %-change: -11.45% -7.88% Instructions are helped. total bundles in shared programs: 25825 -> 25670 (-0.60%) bundles in affected programs: 2057 -> 1902 (-7.54%) helped: 53 HURT: 0 helped stats (abs) min: 1 max: 26 x̄: 2.92 x̃: 2 helped stats (rel) min: 2.86% max: 30.00% x̄: 8.64% x̃: 8.33% 95% mean confidence interval for bundles value: -3.93 -1.92 95% mean confidence interval for bundles %-change: -10.69% -6.59% Bundles are helped. total quadwords in shared programs: 41359 -> 41055 (-0.74%) quadwords in affected programs: 3801 -> 3497 (-8.00%) helped: 57 HURT: 0 helped stats (abs) min: 1 max: 57 x̄: 5.33 x̃: 4 helped stats (rel) min: 1.92% max: 21.05% x̄: 8.22% x̃: 6.67% 95% mean confidence interval for quadwords value: -7.35 -3.32 95% mean confidence interval for quadwords %-change: -9.54% -6.90% Quadwords are helped. total registers in shared programs: 3849 -> 3807 (-1.09%) registers in affected programs: 167 -> 125 (-25.15%) helped: 32 HURT: 1 helped stats (abs) min: 1 max: 3 x̄: 1.34 x̃: 1 helped stats (rel) min: 20.00% max: 50.00% x̄: 26.35% x̃: 20.00% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 16.67% max: 16.67% x̄: 16.67% x̃: 16.67% 95% mean confidence interval for registers value: -1.54 -1.00 95% mean confidence interval for registers %-change: -29.41% -20.69% Registers are helped. total threads in shared programs: 2471 -> 2520 (1.98%) threads in affected programs: 49 -> 98 (100.00%) helped: 25 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.96 x̃: 2 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% 95% mean confidence interval for threads value: 1.88 2.04 95% mean confidence interval for threads %-change: 100.00% 100.00% Threads are [helped]. total spills in shared programs: 168 -> 168 (0.00%) spills in affected programs: 0 -> 0 helped: 0 HURT: 0 total fills in shared programs: 186 -> 186 (0.00%) fills in affected programs: 0 -> 0 helped: 0 HURT: 0 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4999>	2020-06-30 16:21:33 +00:00
Boris Brezillon	cff418cc4c	nir: Add new rules to optimize NOOP pack/unpack pairs nir_load_store_vectorize_test.ssbo_load_adjacent_32_32_64_64 expectations need to be fixed accordingly. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5589>	2020-06-29 09:18:26 +02:00
Kenneth Graunke	6a5fb31fef	nir: Fix divergence analysis for tessellation input/outputs The load_per_vertex_{input,output} intrinsics simply mean that they're reading an arrayed input/output, which have one element per invocation. Most accesses to those use gl_InvocationID as the subscript. However, it's totally possible to read any element of the array. For example, an evaluation shader might read gl_in[2].gl_Position, or a control shader might read output[0]. For threads processing a single patch, an input/output load is convergent if and only if both sources (the per-vertex-array subscript and the offset) are convergent. For threads processing multiple patches, we continued to mark them divergent. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5613>	2020-06-24 03:25:10 +00:00
Jose Maria Casanova Crespo	ba15bb383f	nir: only uniforms with dynamically_uniform offset are dynamically_uniform Previously all nir_intrinsic_load_uniform that were used as sources were considered to be dynamically_uniform but when offsets of load_uniform are indirect it can not be determined. This fixes artefacts in Google Maps 3D view in V3D. Fixes: `886d46b089` ("nir: Add a function to determine if a source is dynamically uniform") Reviewed-by: Neil Roberts <nroberts@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5587>	2020-06-23 13:04:04 +00:00
Rhys Perry	9a389322c4	nir: slight correction to cube_face_coord constant folding ACO does the division with a rcp and then a multiplication. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5547>	2020-06-22 10:28:40 +00:00
Neil Roberts	ed29b576cb	nir/scheduler: Add an option to specify what stages share memory for I/O The scheduler has code to handle hardware that shares the same memory for inputs and outputs. Seeing as the specific stages that need this is probably hardware-dependent, this patch makes it a configurable option instead of hard-coding it to everything but fragment shaders. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5561>	2020-06-22 08:23:06 +02:00
Neil Roberts	28e3209985	nir/schedule: Store a pointer to the scoreboard in nir_deps_state nir_deps_state is a struct used as a closure for calculating the dependencies. Previously it had two fields copied out of the scoreboard. The closure is initialised in two seperate places. In order to make it easier to access other members of the scoreboard in the callbacks in later patches, the closure now just contains a pointer to the scoreboard and the two bits of copied information are removed. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5561>	2020-06-22 08:23:06 +02:00
Neil Roberts	df8dc30cea	nir/scheduler: Handle nir_intrinsic_load_per_vertex_input load_per_vertex_input should probably be handled in the same way as a regular load_input. I think the nir_schedule pass was written before V3D had geometry shader support, so that is probably why it hasn’t taken this into account until now. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5561>	2020-06-22 08:23:06 +02:00
Karol Herbst	feb83f2f82	nir/lower_images: handle dec and inc Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5480>	2020-06-18 15:15:17 +00:00
Jason Ekstrand	20b6ee82ac	nir/intrinsics: Put the _intel intrinsics together at the end All the other driver-specific intrinsics are at the end of the file so Intel's should go there too. Reviewed-by: Sagar Ghuge<sagar.ghuge@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5503>	2020-06-16 20:07:33 +00:00
Rob Clark	167fa2887f	nir/validate: validate intr->num_components Validate that num_components is only set for vectorized instructions, to prevent other nir passes or driver backends from mistakenly relying on num_components for non-vectorized instructions. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5371>	2020-06-16 02:48:18 +00:00
Rob Clark	2e5b5d9584	nir/lower-atomics-to-ssbo: don't set num_components Of the possible intrinsics generated, only load_ssbo is vectorized (and store_ssbo is never generated) Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5371>	2020-06-16 02:48:18 +00:00

1 2 3 4 5 ...

2343 commits