fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-23 13:20:14 +01:00

Author	SHA1	Message	Date
Marek Olšák	fb73058ad2	mesa: add upper bound to limit program state var iterations State parameters are sometimes not perfectly sorted. This optimizes the number of iterations we have to do for fetch_state. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8183>	2021-01-21 21:59:29 +00:00
Marek Olšák	0c77190b31	glsl: split gl_CurrentAttribFragMESA into elements This reduces the constant buffer size by eliminating unused elements because it's no longer a uniform array that the compiler can't split. This looks silly, but there is no other way because all elements must be globally declared, which means they can't be generated by a loop. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8183>	2021-01-21 21:59:29 +00:00
Marek Olšák	e3a7acf958	glsl: remove unused internal builtin gl_CurrentAttribVertMESA Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8183>	2021-01-21 21:59:29 +00:00
Marek Olšák	0eccba1ac0	mesa: flatten STATE_MATERIAL and STATE_LIGHTPROD tokens Flattening continues to get optimal code in fetch_state. This merges the "face" field with the "attrib" field using the combined MAT_ATTRIB_* enums. The outcome is that the inner switch statements can be flattened because we can use MAT_ATTRIB_* to index into the attrib array directly. With LightSource attributes that don't have two sides, more math is involved to get the correct index but it works out nicely too. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8183>	2021-01-21 21:59:29 +00:00
Marek Olšák	b4f3497786	mesa: remove STATE_INTERNAL Let's flatten the tokens to generate optimal code for fetch_state. There was only one name conflict: STATE_NORMAL_SCALE was used both as internal and non-internal. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8183>	2021-01-21 21:59:29 +00:00
Rhys Perry	a6d92eaf4f	nir/sink,nir/move: sink/move reorderable load_ssbo Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6490>	2021-01-21 18:07:03 +00:00
Rhys Perry	e200ce0996	nir/lower_io: fix array_length lowering if buffer is smaller than offset Matches SPIR-V -> NIR implementation of OpArrayLength. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8163>	2021-01-21 11:53:12 +00:00
Jesse Natalie	13b21156e4	nir: Work around MSVC x86 internal compiler error Fixes: `1fd8b466` ("nir,spirv: add sparse image loads") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4108 Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8581>	2021-01-20 20:42:48 +00:00
Ilia Mirkin	a0f4affcf6	glsl: only expose int64 atomics when extension is enabled This limits the exposure of these functions to when the extension is available. Prevents crashes otherwise, as the rest of the infrastructure doesn't necessarily expect these functions when the extension is not available. Fixes: `40c1f9883e` ("mesa,glsl: add support for GL_NV_shader_atomic_int64") Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8533>	2021-01-16 18:21:03 +00:00
Mike Blumenkrantz	652e51e1f3	nir/lower_uniforms_to_ubo: set explicit_binding on uniform_0 this variable is always bound to buffer index 0, so the binding info here is actually useful Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7935>	2021-01-14 17:29:09 +00:00
Mike Blumenkrantz	491e7decad	util/set: add the found param to search_or_add this brings parity with the internal api Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8450>	2021-01-14 13:51:35 +00:00
Rhys Perry	dfe429eb41	nir/loop_unroll: unroll more aggressively if it can improve load scheduling Significantly improves performance of a Control compute shader. Also seems to increase FPS at the very start of the game by ~5% (RX 580, 1080p, medium settings, no MSAA). fossil-db (Sienna): Totals from 81 (0.06% of 139391) affected shaders: SGPRs: 3848 -> 4362 (+13.36%); split: -0.99%, +14.35% VGPRs: 4132 -> 4648 (+12.49%) CodeSize: 275532 -> 659188 (+139.24%) MaxWaves: 986 -> 906 (-8.11%) Instrs: 54422 -> 126865 (+133.11%) Cycles: 1057240 -> 750464 (-29.02%); split: -42.61%, +13.60% VMEM: 26507 -> 61829 (+133.26%); split: +135.56%, -2.30% SMEM: 4748 -> 5895 (+24.16%); split: +31.47%, -7.31% VClause: 1933 -> 6802 (+251.89%); split: -0.72%, +252.61% SClause: 1179 -> 1810 (+53.52%); split: -3.14%, +56.66% Branches: 1174 -> 1157 (-1.45%); split: -23.94%, +22.49% PreVGPRs: 3219 -> 3387 (+5.22%); split: -0.96%, +6.18% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6538>	2021-01-13 18:54:18 +00:00
Daniel Schürmann	08fbd5d454	nir/divergence_analysis: mark load_push_constant as uniform Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8439>	2021-01-12 14:46:13 +00:00
Mike Blumenkrantz	f7527f7f65	glcpp: disable 'windows' tests these timeout a lot Acked-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Acked-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8321>	2021-01-12 01:51:16 +00:00
Daniel Schürmann	bd8e84eb8d	nir: replace .lower_sub with .has_fsub and .has_isub This allows more fine-grained control over whether a backend supports one of these instructions. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6597>	2021-01-11 19:13:51 +00:00
Daniel Schürmann	b3ce55b445	nir,vc4: Lower fneg to fmul(x, -1.0) This patch also replaces lower_negate with lower_ineg / lower_fneg. The fneg semantics have been clarified as of Version 1.5, Revision 1 of the SPIR-V specification, which means that the previous lowering to fsub is not a viable solution anymore, and is replaced with lowering to fmul(x, -1.0). Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6597>	2021-01-11 19:13:51 +00:00
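The commit above does not spell out why the fsub lowering stopped being viable. One place where the two lowerings visibly differ under IEEE 754 arithmetic is signed zero, and the sketch below illustrates only that difference. The function names are hypothetical, Python doubles stand in for shader floats, and treating signed zero as the motivation is an assumption rather than something the commit states.

```python
import math

def lower_fneg_as_fsub(x):
    # Hypothetical old-style lowering: fneg(x) -> fsub(0.0, x)
    return 0.0 - x

def lower_fneg_as_fmul(x):
    # Lowering named in the commit: fneg(x) -> fmul(x, -1.0)
    return x * -1.0

def sign_bit(x):
    # -1.0 if the sign bit is set (including -0.0), else 1.0
    return math.copysign(1.0, x)

for x in (1.5, -2.0, -0.0, 0.0):
    print(x, lower_fneg_as_fsub(x), lower_fneg_as_fmul(x))
```

For nonzero inputs and for -0.0 both forms agree; for +0.0 the subtraction yields +0.0 while the multiplication yields -0.0, which is what a true negation produces.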
Erico Nunes	faaba0d6af	nir/lower_vec_to_movs: don't vectorize unsupported ops If the instruction being coalesced would be vectorized but the target doesn't support vectorizing that op, skip coalescing. Reuse the callbacks from alu_to_scalar to describe which ops should not be vectorized. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6506>	2021-01-11 13:13:30 +00:00
Rhys Perry	b634d7f3e2	nir/opt_vectorize: fix srcs_equal() with two different non-const To match hash_alu_src(), this should return false if both are different non-const ssa defs. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8391>	2021-01-09 11:14:05 +00:00
Rhys Perry	bdf316ae7b	nir/opt_vectorize: fix typo in instr_can_rewrite() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8391>	2021-01-09 11:14:05 +00:00
Eric Anholt	670944ba04	nir/lower_locals_to_regs: Use the imul_imm helper instead of forcing it. Cleaned up a bit of addressing math in the shader I just had to debug. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8373>	2021-01-08 21:04:31 +00:00
Rhys Perry	f5adf27fb9	nir,radv: add and use nir_vectorize_tess_levels() fossil-db (Sienna): Totals from 1342 (0.97% of 138791) affected shaders: CodeSize: 3287996 -> 3269572 (-0.56%); split: -0.56%, +0.00% Instrs: 629896 -> 628191 (-0.27%); split: -0.31%, +0.04% Cycles: 2619244 -> 2612424 (-0.26%); split: -0.30%, +0.04% VMEM: 388807 -> 389273 (+0.12%); split: +0.14%, -0.02% SMEM: 90655 -> 90700 (+0.05%); split: +0.06%, -0.01% VClause: 21831 -> 21812 (-0.09%) PreVGPRs: 44155 -> 44058 (-0.22%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>	2021-01-07 16:34:53 +00:00
Rhys Perry	f199b7188b	nir/load_store_vectorize: add data as callback args Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>	2021-01-07 16:34:53 +00:00
Rhys Perry	00c8bec47b	nir: add nir_load_store_vectorize_options Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>	2021-01-07 16:34:53 +00:00
Rhys Perry	f4eb833a12	nir/load_store_vectorize: don't ignore subgroup memory barriers Not sure why I thought this was correct, but we should consider them for optimization purposes. Fixes: `ce9205c03b` ('nir: add a load/store vectorization pass') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>	2021-01-07 16:34:53 +00:00
Rhys Perry	c73c246e05	nir: gather whether a compute shader uses non-quad subgroup intrinsics Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7918>	2021-01-07 15:01:02 +00:00
Rhys Perry	f7a5b8ed35	vtn: support SpvCapabilitySparseResidency Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>	2021-01-06 20:36:38 +00:00
Rhys Perry	7d1d4acbd5	nir/lower_tex: fix lower_tg4_offsets with sparse fetches Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>	2021-01-06 20:36:38 +00:00
Rhys Perry	2d2decc905	nir: add sparse_residency_code_and Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>	2021-01-06 20:36:38 +00:00
Rhys Perry	4cbdf9ec4d	nir,spirv: implement SpvOpImageSparseTexelsResident Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>	2021-01-06 20:36:38 +00:00
Rhys Perry	1fd8b46667	nir,spirv: add sparse image loads Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>	2021-01-06 20:36:38 +00:00
Rhys Perry	3a7972f72a	nir,spirv: add sparse texture fetches Like SPIR-V and GL_ARB_sparse_texture2, these return a residency code. It is placed in the destination after the rest of the result. If it's zero, then the texel is resident. Otherwise, it's not resident. Besides the larger destination and the residency code, sparse fetches work the same as normal fetches. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>	2021-01-06 20:36:38 +00:00
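The destination layout described in the commit above (result components first, residency code last, zero meaning resident) can be pictured with a small sketch. The helper name and the plain-list representation are hypothetical and are not NIR API; they only mirror the wording of the commit message.

```python
def split_sparse_result(dest, num_data_components):
    """Split a sparse fetch destination into (data, is_resident).

    Assumes the layout from the commit message: the residency code
    sits right after the regular result components, and a code of
    zero means the texel is resident.
    """
    data = dest[:num_data_components]
    residency_code = dest[num_data_components]
    return data, residency_code == 0

# A vec4 result followed by the residency code (5 components in total).
rgba, resident = split_sparse_result([0.1, 0.2, 0.3, 1.0, 0], 4)
print(rgba, resident)
```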
Rhys Perry	95819663b7	nir: allow 5 component vectors These will be useful for sparse texture instructions and image load intrinsics. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>	2021-01-06 20:36:38 +00:00
Rhys Perry	ba4a73a502	nir/tests: fix callback for load/store vectorizer tests Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>	2021-01-06 20:36:38 +00:00
Daniel Schürmann	22b89d9a52	nir/opt_vectorize: fix call to filter function Due to the typo, it could happen that instructions got further vectorized than intended. Fixes: `8eaf9c61d1` ('nir/opt_vectorize: don't hash filtered instructions') Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8352>	2021-01-06 19:03:07 +00:00
Christian Gmeiner	c0fe111d64	nir: use intrinsic builders Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8295>	2021-01-06 14:34:41 +00:00
Mike Blumenkrantz	b5fb66a5ed	nir: preserve explicit_binding in lower_atomics_to_ssbo it's important to be able to tell whether this is explicitly set by the user Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7489>	2021-01-06 12:56:09 +00:00
Jesse Natalie	4d83306a9a	nir: Update saturated float->int/uint conversion algorithm The mantissa for a float doesn't contain enough data to accurately represent the min/max values for some destination types. Instead of clamping before converting, clamp after converting when coming from floats. This improves conformance of CL conversions, specifically for float -> long/ulong with int64 emulation enabled. Refactors the limit determination from the clamp, so we can determine limits for the dest type (int/uint) in both the source (float) and dest type. The limit as a float is used for comparison, while the limit as a dest type is used for bcsel. Important note is that the comparison is inverted to fge instead of flt, so the bcsel chooses the direct int/uint over the converted float in the case where the comparison comes up equal, but the conversion can't produce the exact min/max value. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8256>	2021-01-05 19:46:25 +00:00
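A minimal sketch of the idea in the commit above, for a 32-bit float source and a signed 32-bit destination. It is not the NIR implementation: the names are hypothetical, `struct` is used only to emulate float32 rounding, and NaN inputs are left out of scope.

```python
import struct

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def to_f32(x):
    # Round a Python float (double) to the nearest float32 value.
    return struct.unpack("<f", struct.pack("<f", x))[0]

def f2i32_sat(x):
    """Saturating float32 -> int32 conversion (x assumed not NaN)."""
    hi_f = to_f32(float(INT32_MAX))  # rounds up to 2147483648.0
    lo_f = to_f32(float(INT32_MIN))  # exactly -2147483648.0
    # Compare against the limit as a float, but select the limit as an
    # integer. Using >= (fge) rather than < (flt) means an input equal
    # to the rounded float limit picks the exact integer limit instead
    # of going through a conversion that cannot produce it.
    if x >= hi_f:
        return INT32_MAX
    if x <= lo_f:
        return INT32_MIN
    return int(x)  # truncates toward zero

print(to_f32(float(INT32_MAX)))  # the float limit is not INT32_MAX itself
print(f2i32_sat(2147483648.0), f2i32_sat(-3e9), f2i32_sat(1.5))
```

Clamping the float to `hi_f` first and converting afterwards would hand 2147483648.0 to the conversion, which is outside the int32 range; selecting the integer limit directly avoids that.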
Alexander von Gluck IV	c7486c996e	glsl/builtin_functions: Rename int64 function to int64_avail * int64 is a core type on Haiku (and potentially other platforms) * rename to int64_avail matching other similar calls Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2021-01-04 21:18:55 -06:00
Ian Romanick	539c25c2da	nir/algebraic: Move the flrp -> bcsel rule earlier If multiple rules could match, the rule that appears first in the file is used. Only Tiger Lake and Ice Lake are affected. Other platforms either have a LRP instruction or can't run any shaders from shader-db that would benefit. v2: Fix issues created when this commit was rebased on top of `3c8934a644` ("nir/algebraic: add flrp patterns for 16 and 64 bits"). Noticed by Caio. Tiger Lake and Ice Lake had similar results. total instructions in shared programs: 20908672 -> 20908661 (<.01%) instructions in affected programs: 419 -> 408 (-2.63%) helped: 5 HURT: 0 helped stats (abs) min: 1 max: 3 x̄: 2.20 x̃: 3 helped stats (rel) min: 1.85% max: 3.19% x̄: 2.49% x̃: 2.65% 95% mean confidence interval for instructions value: -3.56 -0.84 95% mean confidence interval for instructions %-change: -3.24% -1.73% Instructions are helped. total cycles in shared programs: 473513940 -> 473513793 (<.01%) cycles in affected programs: 7176 -> 7029 (-2.05%) helped: 12 HURT: 0 helped stats (abs) min: 5 max: 22 x̄: 12.25 x̃: 12 helped stats (rel) min: 0.84% max: 3.24% x̄: 2.09% x̃: 1.80% 95% mean confidence interval for cycles value: -15.43 -9.07 95% mean confidence interval for cycles %-change: -2.57% -1.61% Cycles are helped. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	ec16f935fe	nir/algebraic: Mark comparisons generated from lowered fsign precise This prevents other transformations from converting them to 'a != 0'. For example, both of these transformations can do this: (('~flt', 0.0, ('fabs', a)), ('fne', a, 0.0)), (('~flt', ('fneg', ('fabs', a)), 0.0), ('fne', a, 0.0)), Both fsign(fabs(NaN)) and fsign(fneg(fabs(NaN))) should produce zero, but, since 'NaN != 0.0' is true, cascading these transformations could cause them to generate 1.0 or -1.0 respectively. No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
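The NaN hazard described in the commit above can be reproduced with plain IEEE 754 comparisons. This sketch evaluates the first quoted pattern and its replacement on ordinary Python floats; the function names are hypothetical.

```python
nan = float("nan")

def pattern(a):
    # flt(0.0, fabs(a))
    return 0.0 < abs(a)

def replacement(a):
    # fne(a, 0.0)
    return a != 0.0

for a in (2.0, -3.0, 0.0, nan):
    print(a, pattern(a), replacement(a))
```

The two forms agree for every non-NaN input, but for NaN the ordered comparison is false while the inequality is true, which is why the inexact rewrite must not be applied to comparisons produced by the fsign lowering.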
Ian Romanick	9771af5dde	nir/algebraic: Fix broken NaN and -0.0 behavior No shader-db or fossil-db changes on any Intel platform. v2: Add a coding line to fix SCons build problems caused by the ± character. Fixes: `25bfba3335` ("nir/algebraic: Recognize open-coded copysign(1.0, a)") Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	010e663cc3	spir-v: Mark floating point comparisons exact OpenGL GLSL, OpenGL ARB assembly shaders, and DX9 are pretty loose about the behavior in the presence of NaNs. Many GPUs that implement these specifications do not even have a representation of NaN. However, OpenCL and Vulkan SPIR-V are not so lax. Both actually have some required behavior in the presence of NaN, and, of the two, OpenCL is the stricter. For years we have implemented SPIR-V by using the same comparison opcodes as we use for OpenGL GLSL and OpenGL assembly shaders. This has repeatedly caused problems where an optimization that is valid in the NaN-relaxed world is not valid in Vulkan or OpenCL. To fix this, set the "exact" flag on comparison instructions generated from SPIR-V. This will block optimizations that may have different NaN behavior. v2: Set the exact flag in the nir_builder, not in the vtn_builder. v3: Add an assertion in vtn_handle_constant that the exact flag wasn't set (because it's ignored). Rebase on `80163bbec3` ("nir/vtn: Support OpOrdered and OpUnordered opcodes"). Mark the NIR generated for those opcodes as exact as well. v4: s/unused_exact/exact/ in a couple places, and assert that exact has the expected value (true in one place, false in the other). Suggested by Caio. Closes: #3345 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Tested-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Fixes: `8513b12590` ("nir/opt_if: split ALU from Phi more aggressively") This commit doesn't really fix anything in `8513b12590`. However, without `8513b12590`, a regression is triggered in RADV on No Man's Sky. I want to ensure that this change is only applied on top of `8513b12590`, and Fixes: seems the safest way to do that. No shader-db changes on any Intel platform. This only affects SPIR-V, and we have no OpenGL SPIR-V shaders in shader-db. 124 shaders in Shadow of the Tomb Raider (Steam "native") were hurt by 1 spill and 1 fill each. 
All Intel platforms had similar results. (Tiger Lake shown) Instructions in all programs: 155668276 -> 155685764 (+0.0%) SENDs in all programs: 6474570 -> 6474570 (+0.0%) Loops in all programs: 35271 -> 35271 (+0.0%) Cycles in all programs: 3198055373 -> 3198628031 (+0.0%) Spills in all programs: 231522 -> 231646 (+0.1%) Fills in all programs: 347571 -> 347695 (+0.0%) Vega Totals: SGPRs: 20955712 -> 20956756 (+0.00%); split: -0.02%, +0.03% VGPRs: 13476920 -> 13473132 (-0.03%); split: -0.07%, +0.04% CodeSize: 613371940 -> 613339348 (-0.01%); split: -0.06%, +0.05% MaxWaves: 3111886 -> 3112481 (+0.02%); split: +0.02%, -0.00% Instrs: 120723785 -> 120746991 (+0.02%); split: -0.04%, +0.06% Cycles: 626658992 -> 626862708 (+0.03%); split: -0.05%, +0.08% VMEM: 216330854 -> 216343196 (+0.01%); split: +0.04%, -0.04% SMEM: 32079391 -> 32081972 (+0.01%); split: +0.05%, -0.04% VClause: 2688784 -> 2688789 (+0.00%); split: -0.03%, +0.03% SClause: 6554669 -> 6556251 (+0.02%); split: -0.01%, +0.03% Copies: 5356667 -> 5353283 (-0.06%); split: -0.36%, +0.29% Branches: 954466 -> 954716 (+0.03%); split: -0.01%, +0.04% PreSGPRs: 9078300 -> 9081626 (+0.04%); split: -0.01%, +0.05% PreVGPRs: 10972090 -> 10966576 (-0.05%); split: -0.06%, +0.01% Totals from 48239 (12.08% of 399432) affected shaders: SGPRs: 2713984 -> 2715028 (+0.04%); split: -0.16%, +0.19% VGPRs: 1997804 -> 1994016 (-0.19%); split: -0.46%, +0.27% CodeSize: 172094092 -> 172061500 (-0.02%); split: -0.21%, +0.19% MaxWaves: 337327 -> 337922 (+0.18%); split: +0.20%, -0.02% Instrs: 33053657 -> 33076863 (+0.07%); split: -0.15%, +0.22% Cycles: 254961228 -> 255164944 (+0.08%); split: -0.12%, +0.20% VMEM: 15165226 -> 15177568 (+0.08%); split: +0.59%, -0.51% SMEM: 3304938 -> 3307519 (+0.08%); split: +0.49%, -0.41% VClause: 766225 -> 766230 (+0.00%); split: -0.12%, +0.12% SClause: 1332645 -> 1334227 (+0.12%); split: -0.04%, +0.16% Copies: 2040651 -> 2037267 (-0.17%); split: -0.94%, +0.77% Branches: 743668 -> 743918 (+0.03%); split: 
-0.01%, +0.05% PreSGPRs: 1697667 -> 1700993 (+0.20%); split: -0.07%, +0.27% PreVGPRs: 1718424 -> 1712910 (-0.32%); split: -0.39%, +0.07% Polaris Totals: SGPRs: 21349172 -> 21354376 (+0.02%); split: -0.02%, +0.04% VGPRs: 13690680 -> 13686920 (-0.03%); split: -0.07%, +0.04% CodeSize: 613745824 -> 613704988 (-0.01%); split: -0.06%, +0.05% MaxWaves: 2775012 -> 2775189 (+0.01%); split: +0.01%, -0.00% Instrs: 120735079 -> 120756209 (+0.02%); split: -0.04%, +0.06% Cycles: 627906100 -> 628076156 (+0.03%); split: -0.05%, +0.08% VMEM: 216623065 -> 216641838 (+0.01%); split: +0.04%, -0.04% SMEM: 32295618 -> 32299338 (+0.01%); split: +0.05%, -0.04% VClause: 2711025 -> 2711141 (+0.00%); split: -0.03%, +0.04% SClause: 6545185 -> 6546769 (+0.02%); split: -0.01%, +0.03% Copies: 5387723 -> 5383249 (-0.08%); split: -0.37%, +0.29% Branches: 953775 -> 953954 (+0.02%); split: -0.01%, +0.03% PreSGPRs: 9148814 -> 9153211 (+0.05%); split: -0.01%, +0.06% PreVGPRs: 11029429 -> 11023915 (-0.05%); split: -0.06%, +0.01% Totals from 48239 (12.00% of 402052) affected shaders: SGPRs: 2682056 -> 2687260 (+0.19%); split: -0.16%, +0.35% VGPRs: 1994436 -> 1990676 (-0.19%); split: -0.46%, +0.27% CodeSize: 170857060 -> 170816224 (-0.02%); split: -0.21%, +0.19% MaxWaves: 295429 -> 295606 (+0.06%); split: +0.07%, -0.01% Instrs: 32808802 -> 32829932 (+0.06%); split: -0.16%, +0.22% Cycles: 254633252 -> 254803308 (+0.07%); split: -0.13%, +0.20% VMEM: 14897934 -> 14916707 (+0.13%); split: +0.65%, -0.52% SMEM: 3289726 -> 3293446 (+0.11%); split: +0.53%, -0.42% VClause: 775318 -> 775434 (+0.01%); split: -0.11%, +0.13% SClause: 1304867 -> 1306451 (+0.12%); split: -0.04%, +0.16% Copies: 2026334 -> 2021860 (-0.22%); split: -0.99%, +0.77% Branches: 742554 -> 742733 (+0.02%); split: -0.02%, +0.04% PreSGPRs: 1690887 -> 1695284 (+0.26%); split: -0.07%, +0.33% PreVGPRs: 1717709 -> 1712195 (-0.32%); split: -0.40%, +0.07% Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	55621c6d1c	nir/algebraic: Add some compare-with-zero optimizations that are exact This prevents some fossil-db regressions in "spir-v: Mark floating point comparisons exact". v2: Note that the patterns and replacements produce the same value when isnan(b). Suggested by Caio. v3: Use C99 isfinite() instead of (obsolete) BSD finite(). Fixes various Windows builds. No fossil-db changes on any Intel platform, Vega, or Polaris10. All Intel platforms had similar results. (Tiger Lake shown) total instructions in shared programs: 20908670 -> 20908672 (<.01%) instructions in affected programs: 69 -> 71 (2.90%) helped: 0 HURT: 1 total cycles in shared programs: 473515288 -> 473513940 (<.01%) cycles in affected programs: 4942 -> 3594 (-27.28%) helped: 2 HURT: 0 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	9167324a86	nir/algebraic: Mark some logic-joined comparison reductions as exact This also prevents some fossil-db regressions in "spir-v: Mark floating point comparisons exact". v2: Mark the fmin / fmax in the replacement exact to prevent other optimizations from ruining the NaN-cleansing property of the fmin / fmax. Suggested by Rhys. Don't assume that constants are not NaN because some components of a vector might be NaN while others are numbers. Noticed by Rhys. This causes ~8 more shaders in Age of Wonders III (dxvk) to regress on cycles (not instructions) by less than 1% when "spir-v: Mark floating point comparisons exact" is applied. This difference is too small to care about. All Intel platforms had similar results. (Tiger Lake shown) total instructions in shared programs: 20908668 -> 20908670 (<.01%) instructions in affected programs: 9196 -> 9198 (0.02%) helped: 10 HURT: 5 helped stats (abs) min: 1 max: 2 x̄: 1.40 x̃: 1 helped stats (rel) min: 0.02% max: 5.41% x̄: 2.20% x̃: 2.16% HURT stats (abs) min: 2 max: 6 x̄: 3.20 x̃: 3 HURT stats (rel) min: 2.44% max: 16.67% x̄: 9.39% x̃: 12.50% 95% mean confidence interval for instructions value: -1.22 1.49 95% mean confidence interval for instructions %-change: -2.08% 5.41% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 473515330 -> 473515288 (<.01%) cycles in affected programs: 67146 -> 67104 (-0.06%) helped: 10 HURT: 7 helped stats (abs) min: 1 max: 36 x̄: 15.90 x̃: 17 helped stats (rel) min: 0.01% max: 1.29% x̄: 0.66% x̃: 0.89% HURT stats (abs) min: 1 max: 48 x̄: 16.71 x̃: 4 HURT stats (rel) min: 0.08% max: 1.94% x̄: 0.87% x̃: 0.19% 95% mean confidence interval for cycles value: -13.88 8.94 95% mean confidence interval for cycles %-change: -0.56% 0.49% Inconclusive result (value mean confidence interval includes 0). 
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	71961c73a9	nir: Correctly constant fold fsign(NaN) and fsign(-0) GLSL and SPIR-V GLSL.std.450 don't have any requirements for fsign(NaN), and both only require that FSign(-0.0) == 0.0. OpenCL, on the other hand, requires sign(-0.0) be exactly -0.0. It also requires that sign(NaN) be exactly 0.0. In practice, this change is difficult to test. Our GLSL frontend already constant folds sign(NaN) to 0.0 before even getting to NIR. As far as I can tell, glslang does the same. I don't have a good way to run an OpenCL SPIR-V test. Maybe SPIR-V GLSL.std.450 assembly? No shader-db or fossil-db changes on any Intel platform. Acked-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	fe3c518277	nir/algebraic: Don't add reordered version of patterns for commutative instructions The reordered versions are automatically considered by nir_algebraic rules for commutative instructions. No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	314a40c902	Revert "nir: Replace an odd comparison involving fmin of -b2f" I originally noticed that `3b30814791` ("nir/algebraic: Optimize 1-bit Booleans") caused this pattern to no longer be matched by incorrectly replacing b@32 with b@1. Making that correct had no effect on shader-db. When this pattern originally was added, it only affected 4 shaders, so it's not worth the effort to debug further. This reverts commit `f50400cc80`. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	aec0547838	nir/algebraic: Make some notes about comparison rearrangements versus infinity The original comment was a little terse and a little incorrect. The rearrangements are fine w.r.t. NaN. However, they produce incorrect results if one operand is +Inf and the other is -Inf. A later commit, "nir/algebraic: Add some compare-with-zero optimizations that are exact", will add some more patterns here. It may be reasonable to squash this commit (forward) into that commit. v2: Fix some incorrect comparisons operators in the comment (<= vs >=). Add commentary that subtraction works like addition w.r.t. NaN. Both noticed / suggested by Caio. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
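The commit above does not quote the rearrangement patterns themselves, so the sketch below uses one hypothetical rearrangement of the same flavor, rewriting `a + b >= 0.0` as `a >= -b`, purely to show how opposite infinities break such a rewrite while NaN inputs do not.

```python
inf = float("inf")
nan = float("nan")

def before(a, b):
    # Hypothetical original form: fge(fadd(a, b), 0.0)
    return a + b >= 0.0

def after(a, b):
    # Hypothetical rearranged form: fge(a, fneg(b))
    return a >= -b

for a, b in ((1.0, 2.0), (nan, 1.0), (1.0, nan), (inf, -inf)):
    print(a, b, before(a, b), after(a, b))
```

With a = +Inf and b = -Inf the sum is NaN, so the original comparison is false, whereas the rearranged comparison evaluates +Inf >= +Inf and is true. For NaN operands both forms are false.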
Ian Romanick	363efc2823	nir: Make some notes about fsign versus NaN This commit only documents the current behavior, even if that behavior is not the behavior preferred by the relevant specs. In SPIR-V, there are two flavors of the sign instruction, and each lives in an extended instruction set. The GLSL.std.450 FSign instruction is defined as: Result is 1.0 if x > 0, 0.0 if x = 0, or -1.0 if x < 0. This also matches the GLSL 4.60 definition. However, the OpenCL.ExtendedInstructionSet.100 sign instruction is defined as: Returns 1.0 if x > 0, -0.0 if x = -0.0, +0.0 if x = +0.0, or -1.0 if x < 0. Returns 0.0 if x is a NaN. There are two differences. Each treats -0.0 differently, and each also treats NaN differently. Specifically, GLSL.std.450 FSign does not define any specific behavior for NaN. There has been some discussion in Khronos about the NaN behavior of GLSL.std.450 FSign. As part of that discussion, I did some research into how we treat NaN for nir_op_fsign, and this commit just captures some of those notes. v2: Document the expected behavior of nir_op_fsign more thoroughly. Suggested by Rhys. Note that the current implementation of constant folding does not produce the expected result for NaN. Suggested by Caio. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> [v1] Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
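The two definitions quoted in the commit above can be written side by side. This is a sketch on Python floats with hypothetical function names. GLSL.std.450 FSign leaves NaN undefined, so the value returned for NaN in `fsign_glsl` is only what falls out of the comparisons, not specified behavior.

```python
import math

def fsign_glsl(x):
    # GLSL.std.450 FSign: 1.0 if x > 0, 0.0 if x = 0, -1.0 if x < 0.
    # NaN is not covered by the definition; this sketch falls through to 0.0.
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    return 0.0

def sign_opencl(x):
    # OpenCL sign: like FSign, but the sign of zero is preserved
    # and NaN is required to produce 0.0.
    if math.isnan(x):
        return 0.0
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    return x  # x is +0.0 or -0.0 here; return it unchanged

for x in (3.0, -3.0, 0.0, -0.0, float("nan")):
    print(x, fsign_glsl(x), sign_opencl(x))
```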
Danylo Piliaiev	81132983cd	nir: fix missing nir_lower_pntc_ytransform.c in the makefile Fixes: `33fd9e5d` "nir: account for point-coord origin when lowering it" Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8308>	2021-01-04 15:37:20 +00:00


6049 commits