fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-21 04:38:09 +02:00

Author	SHA1	Message	Date
Eric Anholt	8bd0cc1a5a	nir/vec_to_movs: Don't generate MOVs for undef channels. This appeared in softpipe's image operations, since NIR always uses 4-component values for the coords, while the GLSL IR only has 2 components for a 2D image (for example). arb_shader_image_load_store-shader-mem-barrier (which times out in CI and spends its time inside of tgsi_exec) was spending 4/51 of its instructions on moving these undefs around. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9345>	2021-03-03 00:51:44 +00:00
Eric Anholt	1e5ef4c60c	nir: Add a nir_src_is_undef() helper, like nir_src_is_const(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9345>	2021-03-03 00:51:44 +00:00
Gert Wollny	935d9e6863	nir: disallow reordering for r600 shared load and remove component field The original shared load op can't be reordered, so it might be better to also not allow this for the lowered variant. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9330>	2021-03-02 18:46:17 +01:00
Rhys Perry	812dd9c9f6	nir/copy_prop: use nir_{instr,if}_rewrite_{src,condition}_ssa Compile-time (nir_copy_prop): Difference at 95.0% confidence -2470.88 +/- 19.8762 -35.7461% +/- 0.247259% (Student's t, pooled s = 23.4747) Compile-time (overall): Difference at 95.0% confidence -2175.72 +/- 178.786 -1.73627% +/- 0.140826% (Student's t, pooled s = 211.155) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8784>	2021-03-01 17:38:10 +00:00
Rhys Perry	c2209d836c	nir/copy_prop: visit copies instead of sources There are fewer copy instructions than sources, so instead of visiting each source and rewriting it if it uses a copy instruction, visit each copy instruction and rewrite its users. Besides improving compile time, this also has a side effect of fixing a rare situation where copy-propagation does not happen: loop { a = phi ..., b c = vec ... b = mov c.y } It might have been the case that a phi source could not be rewritten until the copy was visited later. Compile-time (nir_copy_prop): Difference at 95.0% confidence -2613.13 +/- 15.2094 -27.4333% +/- 0.150247% (Student's t, pooled s = 17.963) Compile-time (overall): Difference at 95.0% confidence -2627.89 +/- 201.557 -2.05404% +/- 0.156221% (Student's t, pooled s = 238.048) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8784>	2021-03-01 17:38:10 +00:00
Rhys Perry	41125bff4f	nir/copy_prop: remove unused copies These were hurting performance of other passes. Compile-time (overall): Difference at 95.0% confidence -5496.3 +/- 219.752 -4.11912% +/- 0.160285% (Student's t, pooled s = 259.538) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8784>	2021-03-01 17:38:10 +00:00
Rhys Perry	ed9c3c4f19	nir: add nir_ssa_def_is_unused() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8784>	2021-03-01 17:38:10 +00:00
Rhys Perry	f66a7240f9	nir: fix build at -O1 At -O1 with GCC 10.2.1, _nir_visit_dest_indirect (declared ALWAYS_INLINE) will fail to inline if its caller (nir_foreach_dest) is not inlined, because _nir_visit_dest_indirect is passed as a function pointer. This results in a compilation error. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com> Fixes: `336bcbacd0` ("nir: inline nir_foreach_{src,dest}") Tested-by: Witold Baryluk <witold.baryluk@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4353 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9301>	2021-02-26 21:54:53 +00:00
Rob Clark	a9618e7c42	util: Add accessor for util_cpu_caps In release builds, there should be no change, but in debug builds the assert will help us catch undefined behavior resulting from using util_cpu_caps before it is initialized. With fix for u_half_test for MSVC from Jesse Natalie squashed in. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9266>	2021-02-26 18:31:19 +00:00
Gert Wollny	e5db9c3dd4	nir: Add r600 specific CUBE opcode to evaluate cube texture coords and face The opcode evaluates the unnormalized coordinates, the length of the major axis, and the cube face. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9200>	2021-02-26 09:51:37 +01:00
Gert Wollny	4f4e1e5ed9	nir: Add flag to tex instruction to indicate lowering cube to array E.g. on r600 a cube texture lookup uses a specific cube instruction to evaluate the sample coordinates and the face ID, so that the cube texture lookup can be lowered to an array texture lookup, thereby sharing the code with the 2D array texture lookup. However, for TXD the given gradients still need to be three-component vectors, so add a flag so that the NIR validation knows that we deal with a cube texture that was lowered to an array and can validate accordingly. v2: Handle new flag in serialization (Marek) v3: Rebase so that the change does not require the patch to deduce the number of offset and grad components from sampler type Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2) Acked-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9200>	2021-02-26 09:51:37 +01:00
Vinson Lee	7b934d1ecd	nir/lower_tex: Change coord type to int. nir_tex_instr_src_index returns an int. Fix defect reported by Coverity Scan. Macro compares unsigned to 0 (NO_EFFECT) unsigned_compare: This greater-than-or-equal-to-zero comparison of an unsigned value is always true. coord >= 0U. Fixes: `b154a4154b` ("nir/lower_tex: rewrite tex/txb -> txd/txl before saturating srcs") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9181>	2021-02-25 04:15:07 +00:00
Mike Blumenkrantz	2e60929b47	nir/texcoord_replace: add a yinvert param vulkan needs to invert the y coord in order to handle PIPE_SPRITE_COORD_LOWER_LEFT Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9115>	2021-02-24 23:25:01 +00:00
Rhys Perry	71a985d80b	nir/dce: perform DCE for unlooped instructions in a single pass It's unnecessary to iterate twice for instructions outside loops. Compile-time (nir_opt_dce): Difference at 95.0% confidence -630.64 +/- 6.18761 -27.0751% +/- 0.223134% (Student's t, pooled s = 7.30785) Compile-time (entire run): Difference at 95.0% confidence -749.54 +/- 48.8272 -1.82644% +/- 0.117838% (Student's t, pooled s = 57.6672) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7691>	2021-02-24 09:58:59 +00:00
Rhys Perry	336bcbacd0	nir: inline nir_foreach_{src,dest} Compile-time (nir_opt_dce): Difference at 95.0% confidence -319.51 +/- 5.67632 -12.0627% +/- 0.208076% (Student's t, pooled s = 6.70399) Compile-time (overall): Difference at 95.0% confidence -385.025 +/- 42.1124 -0.929489% +/- 0.10139% (Student's t, pooled s = 49.7367) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7691>	2021-02-24 09:58:59 +00:00
Rhys Perry	325f627d88	nir/dce: replace instruction worklist with ssa def bitset Instead of a keeping a worklist of live instructions, use a bitset of live ssa defs and iterate over instructions in reverse. Compile-time (nir_opt_dce): Difference at 95.0% confidence -931.911 +/- 4.41383 -26.0263% +/- 0.105781% (Student's t, pooled s = 5.21293) Compile-time (overall): Difference at 95.0% confidence -882.245 +/- 28.3492 -2.08541% +/- 0.0665121% (Student's t, pooled s = 33.4818) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7691>	2021-02-24 09:58:59 +00:00
Christian Gmeiner	8cb52f6735	nir/lower_tex: wider usage of nir_tex_instr_src_index(..) Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Suggested-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8898>	2021-02-23 14:04:30 +00:00
Christian Gmeiner	a403ff4d70	nir/lower_tex: 'txs free' tex_rect lowering GPUs without native txs support (and without an emulation in sw) can use this new lowering. Also it saves us from doing int/float conversions. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8898>	2021-02-23 14:04:30 +00:00
Christian Gmeiner	3fbde2fd93	nir: add has_txs flag Some nir lowerings might need to know if txs is supported by the backend. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8898>	2021-02-23 14:04:30 +00:00
Christian Gmeiner	b0e23c92b3	nir: add load_texture_rect_scaling Will be used in a different form of lower_rect tex lowering. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8898>	2021-02-23 14:04:30 +00:00
Ian Romanick	f2656569c6	nir/range_analysis: Handle vectors better in ssa_def_bits_used If a query is made of a vector ssa_def (possibly from an intermediate result), return all_bits. If a constant source is a vector, swizzle the correct component. Unit tests were added for the constant vector cases. I don't see a great way to make unit tests for the other cases. v2: Add a FINISHME comment about u16vec2 hardware. Fixes: `96303a59ea` ("nir: Add some range analysis for used bits") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9123>	2021-02-22 22:37:17 +00:00
Ian Romanick	ce649e54f1	nir/range-analysis: C++ linkage Fixes: `96303a59ea` ("nir: Add some range analysis for used bits") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9123>	2021-02-22 22:37:17 +00:00
Timothy Arceri	9f474bd4b4	nir: handle negatives in ffma reassociation optimisation shader-db results Iris (BDW): total instructions in shared programs: 16632076 -> 16631057 (<.01%) instructions in affected programs: 48010 -> 46991 (-2.12%) helped: 47 HURT: 6 total cycles in shared programs: 915266726 -> 915263622 (<.01%) cycles in affected programs: 1182283 -> 1179179 (-0.26%) helped: 18 HURT: 27 total loops in shared programs: 4929 -> 4929 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total spills in shared programs: 18834 -> 18801 (-0.18%) spills in affected programs: 525 -> 492 (-6.29%) helped: 3 HURT: 0 total fills in shared programs: 23008 -> 22981 (-0.12%) fills in affected programs: 435 -> 408 (-6.21%) helped: 3 HURT: 0 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8608>	2021-02-22 00:49:13 +00:00
Ian Romanick	3250e04d25	nir/algebraic: Add some max/min optimizations with 3 variables Specifically, ARB assembly shaders with code like SLT r0, r0, c[0].xxxx; ... KIL r0.xyzx; can result in this pattern. The other cases (e.g., 'KIL r0.xxxx' and 'KIL r0.xyxx') are handled by existing patterns. Reviewed-by: Matt Turner <mattst88@gmail.com> All Intel platforms had similar results. (Tiger Lake shown) total instructions in shared programs: 21050098 -> 21050065 (<.01%) instructions in affected programs: 2062 -> 2029 (-1.60%) helped: 31 HURT: 1 helped stats (abs) min: 1 max: 3 x̄: 1.10 x̃: 1 helped stats (rel) min: 1.14% max: 4.35% x̄: 1.89% x̃: 1.69% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.65% max: 0.65% x̄: 0.65% x̃: 0.65% 95% mean confidence interval for instructions value: -1.23 -0.84 95% mean confidence interval for instructions %-change: -2.12% -1.50% Instructions are helped. total cycles in shared programs: 855105466 -> 855105055 (<.01%) cycles in affected programs: 50136 -> 49725 (-0.82%) helped: 33 HURT: 0 helped stats (abs) min: 3 max: 22 x̄: 12.45 x̃: 12 helped stats (rel) min: 0.13% max: 1.57% x̄: 0.86% x̃: 0.92% 95% mean confidence interval for cycles value: -13.78 -11.13 95% mean confidence interval for cycles %-change: -0.97% -0.76% Cycles are helped. No fossil-db changes on any Intel platform. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9122>	2021-02-19 17:31:27 -08:00
Ian Romanick	d9b5bce85a	nir/algebraic: Remove some redundant b2f logic-op reduction patterns There are patterns that will re-write the fmin or fmax part into a form that other patterns will gradually convert to the same ior or iand. For example, fmax(b2f(a), b2f(b)) != 0 b2f(a \|\| b) != 0 a \|\| b No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9122>	2021-02-19 17:31:24 -08:00
Ian Romanick	7e127c1fca	nir/algebraic: Fix some min/max of b2f replacements fmin(-A, -B) is -fmax(A, B), and fmax(-A, -B) is -fmin(A, B). Therefore the logic joining A and B should toggle between ior and iand for the negated versions. At the very least, a shader from Euro Truck Simulator 2 in shader-db is affected by this. The KIL instruction in the (ARB assembly) shader ends up with the wrong logic. This is _probably_ the source of https://gitlab.freedesktop.org/mesa/mesa/-/issues/1346. That said, the issue mentions that Mesa 18.0.5 works, but commit `68420d8322` ("nir: Simplify min and max of b2f") was added in 17.3. Moreover, I was not able to reproduce the error in the ETS2 shader from shader-db from any Mesa commit near the time the original fd.o bugzilla was submitted (December 2018). 🤷 In fact, the current error in that shader starts with `9167324a86` ("nir/algebraic: Mark some logic-joined comparison reductions as exact"). That's a bit of a red herring as `9167324a86` just sets off a chain of replacements that eventually leads to the incorrect min/max of b2f patterns fixed by this commit. The other affected shaders in the shader-db results are from Cargo Commander. These are also ARB assembly shaders. I think any ARB assembly shader that uses the pattern SLT r0, ...; ... KIL -r0; will suffer from issues related to this. This change fixes the piglit tests/spec/arb_fragment_program/kil-of-slt.shader_test test added in https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/454. shader-db results: All Gen6+ platforms had similar result. 
(Ice Lake shown) total instructions in shared programs: 20034604 -> 20034486 (<.01%) instructions in affected programs: 3885 -> 3767 (-3.04%) helped: 47 HURT: 2 helped stats (abs) min: 2 max: 4 x̄: 2.64 x̃: 2 helped stats (rel) min: 2.33% max: 8.33% x̄: 3.48% x̃: 3.39% HURT stats (abs) min: 3 max: 3 x̄: 3.00 x̃: 3 HURT stats (rel) min: 13.64% max: 16.67% x̄: 15.15% x̃: 15.15% 95% mean confidence interval for instructions value: -2.83 -1.99 95% mean confidence interval for instructions %-change: -3.84% -1.60% Instructions are helped. total cycles in shared programs: 979881379 -> 979879406 (<.01%) cycles in affected programs: 119873 -> 117900 (-1.65%) helped: 46 HURT: 3 helped stats (abs) min: 10 max: 756 x̄: 45.41 x̃: 26 helped stats (rel) min: 0.53% max: 19.72% x̄: 1.67% x̃: 1.26% HURT stats (abs) min: 28 max: 56 x̄: 38.67 x̃: 32 HURT stats (rel) min: 1.44% max: 3.54% x̄: 2.75% x̃: 3.27% 95% mean confidence interval for cycles value: -70.83 -9.70 95% mean confidence interval for cycles %-change: -2.23% -0.57% Cycles are helped. Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8115098 -> 8115076 (<.01%) instructions in affected programs: 2592 -> 2570 (-0.85%) helped: 32 HURT: 2 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.88% max: 2.70% x̄: 1.35% x̃: 1.31% HURT stats (abs) min: 5 max: 5 x̄: 5.00 x̃: 5 HURT stats (rel) min: 17.24% max: 18.52% x̄: 17.88% x̃: 17.88% 95% mean confidence interval for instructions value: -1.15 -0.15 95% mean confidence interval for instructions %-change: -1.83% 1.39% Inconclusive result (%-change mean confidence interval includes 0). 
total cycles in shared programs: 238189718 -> 238189802 (<.01%) cycles in affected programs: 75076 -> 75160 (0.11%) helped: 3 HURT: 31 helped stats (abs) min: 2 max: 130 x̄: 44.67 x̃: 2 helped stats (rel) min: 0.18% max: 5.70% x̄: 2.02% x̃: 0.19% HURT stats (abs) min: 2 max: 70 x̄: 7.03 x̃: 4 HURT stats (rel) min: 0.07% max: 6.41% x̄: 0.53% x̃: 0.15% 95% mean confidence interval for cycles value: -7.27 12.21 95% mean confidence interval for cycles %-change: -0.33% 0.94% Inconclusive result (value mean confidence interval includes 0). No fossil-db changes on any Intel platform. Fixes: `68420d8322` ("nir: Simplify min and max of b2f") Closes: #1346 Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9122>	2021-02-19 17:30:53 -08:00
Adam Jackson	fad353d7f8	nir: Silence a warning at -Og This throws a curious warning: In file included from ../src/compiler/nir/nir.h:32, from ../src/compiler/nir/nir_opt_if.c:24: ../src/compiler/nir/nir_opt_if.c: In function ‘opt_if_loop_last_continue’: ../src/compiler/glsl/list.h:415:64: warning: ‘nif’ may be used uninitialized in this function [-Wmaybe-uninitialized] 415 \| return !exec_list_is_empty(list) ? list->tail_sentinel.prev : NULL; \| ^ What's going on here is not enough of the optimizer has run to be able to prove that nif is always initialized. So just handle the "can't happen" case as if it could. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8724>	2021-02-18 20:59:43 +00:00
Mike Blumenkrantz	b154a4154b	nir/lower_tex: rewrite tex/txb -> txd/txl before saturating srcs this fixes mipmapping with saturate by saturating the coord param while passing an additional param (partial derivatives or lod) that uses the unsaturated coord value Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8756>	2021-02-18 14:32:05 +00:00
Daniel Schürmann	2e6c9e54f1	nir: lower is/load_helper to zero if no helper lanes are needed If there are no helper invocations required during the execution of the shader, we can assume that there also are no helper invocations active. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9058>	2021-02-17 21:53:52 +00:00
Daniel Schürmann	b689a65316	nir: lower load_helper to is_helper if the shader uses demote() load_helper_invocation is an Input Builtin, for which the value should not change during the execution of a shader. This new pass inserts an is_helper intrinsic before any demote() instruction and re-uses its value. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9058>	2021-02-17 21:53:52 +00:00
Alyssa Rosenzweig	2104135f38	nir: Fix grammar error Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9103>	2021-02-17 17:23:09 +00:00
Jason Ekstrand	12fa219768	nir/opt_large_constants: Handle generic pointers We already throw out any variables which may have a complex use so we just need to make sure that our mode checks don't assert if we have a deref which may_be but not must_be nir_var_function_temp. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9068>	2021-02-17 03:59:25 +00:00
Jason Ekstrand	8b133a1b25	nir: Fix parameter order in the bcsel-of-shuffle optimization Fixes: `4ff4d4e569` "nir/opt_intrinsic: Optimize bcsel(b, shuffle..." Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9068>	2021-02-17 03:59:25 +00:00
Jason Ekstrand	ceb6986d34	nir: Don't optimize bcsel-of-shuffle across blocks We can't move the shuffle to a new block so this only works if the shuffle and the bcsel are in the same block. Fortunately, in the motivating case, this is true. Also, we have to be careful around discard. We could try really hard to just avoid moving them past discard but we choose to simply bail if we see a discard instead. Fixes: `4ff4d4e569` "nir/opt_intrinsic: Optimize bcsel(b, shuffle..." Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9068>	2021-02-17 03:59:25 +00:00
Jason Ekstrand	2491d5a662	nir/algebraic: Convert up-cast of down-cast to extract on Intel This starts generating extract for bit sizes other than 32 but our back-end handles that just fine. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8872>	2021-02-16 16:36:31 +00:00
Jason Ekstrand	f9b3be09e1	nir/algebraic: Clean up up-cast of down-cast when we can There are a bunch of cases where we can pretty quickly determine that the high bits don't matter. In these cases, delete the casts. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8872>	2021-02-16 16:36:31 +00:00
Jason Ekstrand	96303a59ea	nir: Add some range analysis for used bits This isn't 100% accurate, of course, but it should be good enough for what we're about to do with it. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8872>	2021-02-16 16:36:31 +00:00
Jason Ekstrand	d41ac6e2ca	nir/lower_bit_size: Support phi instructions Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8872>	2021-02-16 16:36:31 +00:00
Jason Ekstrand	6413e67591	nir: Add a couple helpers for phis and cursors Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8872>	2021-02-16 16:36:31 +00:00
Alyssa Rosenzweig	2afdcc187b	nir: Add sample_positions_pan intrinsic Facilitates the gl_SamplePosition lowering on Bifrost, where the sample positions are accessed directly in a packed in-memory format. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8774>	2021-02-12 16:44:28 -05:00
Alyssa Rosenzweig	9f934e922d	compiler, nir: Add and set barrier metadata Useful for determining whether certain optimizations are legal for a compute shader (e.g. optimizing workgroup size in the driver). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6312>	2021-02-12 01:37:05 +00:00
Ian Romanick	ed138f2861	nir/algebraic: Partially revert `3f782cdd25` I'm not sure what the logic was, but there is no opportunity for anything to flush to zero here. 'a' is a Boolean value, and b2f produces 1.0 or 0.0. This was originally part of https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3765/. Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: Andres Gomez <agomez@igalia.com> Cc: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8910>	2021-02-07 18:31:01 -08:00
Ian Romanick	5923742356	nir/algebraic: add patterns for a >> #b << #b and a << #b >> #b Commit `5476d18183` ("nir/algebraic: add patterns for a >> #b << #b") added the ushr version, but it missed the ishr. A bunch of compute shaders with stores to shared storage generate the ishr pattern. Enabling this optimization also enables the iadd/iand reassociation (right after this hunk), and that enables merging of stores to shared storage. A couple shaders have spills and fills hurt on some platforms. These all occur in shaders that also have SENDs helped. On Gen9 and Gen11, the helped SENDs more than make up for the extra spills and fills. On Gen7 and Gen8, it's not as clear. All of the shaders affected are compute shaders in DiRT Rally 2 or Bioshock Infinite. The most affected Bioshock shader on Broadwell looks like: Before: CS SIMD8 shader: 1335 inst, 0 loops, 22411 cycles, 42:36 spills:fills, 159 sends, scheduled with mode lifo, Promoted 2 constants, compacted 21360 to 16528 bytes. After: CS SIMD8 shader: 1175 inst, 0 loops, 25916 cycles, 96:135 spills:fills, 72 sends, scheduled with mode lifo, Promoted 2 constants, compacted 18800 to 13648 bytes. The results on Haswell and Ivy Bridge are similar. Given that there are only 2 promoted constants, MR !7698 won't have any effect. There were no statistically significant changes on Gen9+ in Bioshock in our performance CI. Gen8 isn't in that CI, and DiRT Showdown 2 is also not included in that CI. It is possible that these shaders aren't used in the settings or demos used in the CI. The other pattern, which switches the order of the shifts, only helps a couple shaders. If I wasn't already adding another pattern, I definitely wouldn't bother with that one. v2: s/ishr/ushr/ in the replacement for the ushr pattern. Noticed by Rhys. 
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tiger Lake total instructions in shared programs: 21052760 -> 21049269 (-0.02%) instructions in affected programs: 59497 -> 56006 (-5.87%) helped: 46 HURT: 0 helped stats (abs) min: 2 max: 552 x̄: 75.89 x̃: 53 helped stats (rel) min: 0.28% max: 43.43% x̄: 5.87% x̃: 4.10% 95% mean confidence interval for instructions value: -108.96 -42.82 95% mean confidence interval for instructions %-change: -8.38% -3.35% Instructions are helped. total cycles in shared programs: 855229761 -> 855148518 (<.01%) cycles in affected programs: 8491373 -> 8410130 (-0.96%) helped: 33 HURT: 15 helped stats (abs) min: 42 max: 26940 x̄: 6200.70 x̃: 4329 helped stats (rel) min: 0.09% max: 38.78% x̄: 7.97% x̃: 4.29% HURT stats (abs) min: 2 max: 18132 x̄: 8225.33 x̃: 7288 HURT stats (rel) min: <.01% max: 13.37% x̄: 5.72% x̃: 4.53% 95% mean confidence interval for cycles value: -4331.52 946.40 95% mean confidence interval for cycles %-change: -6.78% -0.61% Inconclusive result (value mean confidence interval includes 0). total sends in shared programs: 989947 -> 989694 (-0.03%) sends in affected programs: 523 -> 270 (-48.37%) helped: 5 HURT: 0 helped stats (abs) min: 9 max: 87 x̄: 50.60 x̃: 37 helped stats (rel) min: 25.71% max: 54.72% x̄: 43.49% x̃: 42.53% 95% mean confidence interval for sends value: -93.95 -7.25 95% mean confidence interval for sends %-change: -58.48% -28.50% Sends are helped. Ice Lake and Skylake had similar results. (Ice Lake shown) total instructions in shared programs: 20033498 -> 20030552 (-0.01%) instructions in affected programs: 59220 -> 56274 (-4.97%) helped: 48 HURT: 0 helped stats (abs) min: 1 max: 465 x̄: 61.38 x̃: 39 helped stats (rel) min: 0.03% max: 42.27% x̄: 5.19% x̃: 3.90% 95% mean confidence interval for instructions value: -89.57 -33.18 95% mean confidence interval for instructions %-change: -7.49% -2.89% Instructions are helped. 
total cycles in shared programs: 979993675 -> 979840773 (-0.02%) cycles in affected programs: 6738454 -> 6585552 (-2.27%) helped: 46 HURT: 0 helped stats (abs) min: 42 max: 6265 x̄: 3323.96 x̃: 3579 helped stats (rel) min: 0.09% max: 37.38% x̄: 4.34% x̃: 2.39% 95% mean confidence interval for cycles value: -3664.70 -2983.21 95% mean confidence interval for cycles %-change: -6.63% -2.06% Cycles are helped. total spills in shared programs: 10659 -> 10661 (0.02%) spills in affected programs: 36 -> 38 (5.56%) helped: 1 HURT: 1 total fills in shared programs: 11551 -> 11551 (0.00%) fills in affected programs: 70 -> 70 (0.00%) helped: 1 HURT: 1 total sends in shared programs: 1032117 -> 1031785 (-0.03%) sends in affected programs: 711 -> 379 (-46.69%) helped: 5 HURT: 0 helped stats (abs) min: 18 max: 87 x̄: 66.40 x̃: 74 helped stats (rel) min: 27.69% max: 54.72% x̄: 44.49% x̃: 44.31% 95% mean confidence interval for sends value: -101.79 -31.01 95% mean confidence interval for sends %-change: -58.42% -30.55% Sends are helped. Broadwell total instructions in shared programs: 17865005 -> 17862757 (-0.01%) instructions in affected programs: 66438 -> 64190 (-3.38%) helped: 49 HURT: 0 helped stats (abs) min: 1 max: 266 x̄: 45.88 x̃: 39 helped stats (rel) min: 0.03% max: 11.99% x̄: 3.73% x̃: 3.92% 95% mean confidence interval for instructions value: -59.15 -32.61 95% mean confidence interval for instructions %-change: -4.35% -3.12% Instructions are helped. 
total cycles in shared programs: 1031298803 -> 1031219023 (<.01%) cycles in affected programs: 7253602 -> 7173822 (-1.10%) helped: 45 HURT: 2 helped stats (abs) min: 18 max: 7828 x̄: 1928.33 x̃: 1918 helped stats (rel) min: <.01% max: 10.51% x̄: 1.58% x̃: 1.31% HURT stats (abs) min: 3490 max: 3505 x̄: 3497.50 x̃: 3497 HURT stats (rel) min: 15.56% max: 15.64% x̄: 15.60% x̃: 15.60% 95% mean confidence interval for cycles value: -2174.88 -1220.01 95% mean confidence interval for cycles %-change: -2.00% 0.30% Inconclusive result (%-change mean confidence interval includes 0). total spills in shared programs: 20799 -> 20924 (0.60%) spills in affected programs: 843 -> 968 (14.83%) helped: 0 HURT: 4 total fills in shared programs: 27110 -> 27334 (0.83%) fills in affected programs: 1824 -> 2048 (12.28%) helped: 1 HURT: 4 total sends in shared programs: 1017935 -> 1017603 (-0.03%) sends in affected programs: 711 -> 379 (-46.69%) helped: 5 HURT: 0 helped stats (abs) min: 18 max: 87 x̄: 66.40 x̃: 74 helped stats (rel) min: 27.69% max: 54.72% x̄: 44.49% x̃: 44.31% 95% mean confidence interval for sends value: -101.79 -31.01 95% mean confidence interval for sends %-change: -58.42% -30.55% Sends are helped. Haswell and Ivy Bridge had similar results. (Haswell shown) total instructions in shared programs: 16397496 -> 16395411 (-0.01%) instructions in affected programs: 59384 -> 57299 (-3.51%) helped: 49 HURT: 0 helped stats (abs) min: 1 max: 208 x̄: 42.55 x̃: 39 helped stats (rel) min: 0.03% max: 8.18% x̄: 3.74% x̃: 3.91% 95% mean confidence interval for instructions value: -53.59 -31.51 95% mean confidence interval for instructions %-change: -4.24% -3.23% Instructions are helped. 
total cycles in shared programs: 1035483504 -> 1035397592 (<.01%) cycles in affected programs: 9379739 -> 9293827 (-0.92%) helped: 45 HURT: 4 helped stats (abs) min: 10 max: 5600 x̄: 2164.51 x̃: 2350 helped stats (rel) min: <.01% max: 11.61% x̄: 1.93% x̃: 1.56% HURT stats (abs) min: 2 max: 5756 x̄: 2872.75 x̃: 2866 HURT stats (rel) min: <.01% max: 24.65% x̄: 12.29% x̃: 12.26% 95% mean confidence interval for cycles value: -2293.06 -1213.56 95% mean confidence interval for cycles %-change: -2.42% 0.88% Inconclusive result (%-change mean confidence interval includes 0). total spills in shared programs: 17672 -> 17803 (0.74%) spills in affected programs: 364 -> 495 (35.99%) helped: 2 HURT: 2 total fills in shared programs: 20752 -> 20937 (0.89%) fills in affected programs: 656 -> 841 (28.20%) helped: 2 HURT: 2 total sends in shared programs: 1044703 -> 1044450 (-0.02%) sends in affected programs: 523 -> 270 (-48.37%) helped: 5 HURT: 0 helped stats (abs) min: 9 max: 87 x̄: 50.60 x̃: 37 helped stats (rel) min: 25.71% max: 54.72% x̄: 43.49% x̃: 42.53% 95% mean confidence interval for sends value: -93.95 -7.25 95% mean confidence interval for sends %-change: -58.48% -28.50% Sends are helped. No changes on Gen6 or earlier GPUs. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8852>	2021-02-08 00:25:22 +00:00
Ian Romanick	6b0443a900	nir/algebraic: Fix a >> #b << #b for sizes other than 32-bit The base mask previously used was 0xffffffff. This is not correct (but should still work) for 16-bit and 8-bit values, but it means the high 32-bits of 64-bit values will get chopped off. Instead of just restricting the pattern to 32-bits (as was done before `00b28a50b2`), this extends the optimization in two ways: 1. Make it correct for other bit sizes. 2. Make it work for arbitrary shift counts. This has the added benefit of reducing the number of patterns actually added (7 previously, 4 now). The "Reassociate for improved CSE" part is just reverted to its pre-00b28a50b2c behavior. I doubt that pattern is likely to have much impact outside 32-bits. This change fixes the piglit tests tests/spec/arb_gpu_shader_int64/fs-shl-of-shr-int64.shader_test and tests/spec/arb_gpu_shader_int64/fs-iand-of-iadd-int64.shader_test. All of the shaders helped in shader-db are vertex shaders on platforms with vector-oriented vertex processing. The shaders contain ((x >> 16) << 16). These platforms set lower_extract_word, so the optimization that transforms (x >> 16) to extract_u16 doesn't trigger. With only ~60 shaders involved, I didn't bother trying to add extract_XYZ versions of these patterns to try to get those cases. Fixes: `00b28a50b2` ("nir/algebraic: trivially enable existing 32-bit patterns for all bit sizes") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Haswell and earlier Intel GPUs had similar results. (Haswell shown) total instructions in shared programs: 16397554 -> 16397496 (<.01%) instructions in affected programs: 7961 -> 7903 (-0.73%) helped: 58 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.36% max: 1.89% x̄: 0.99% x̃: 0.78% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -1.13% -0.85% Instructions are helped. 
total cycles in shared programs: 1035483770 -> 1035483504 (<.01%) cycles in affected programs: 75922 -> 75656 (-0.35%) helped: 44 HURT: 2 helped stats (abs) min: 2 max: 12 x̄: 6.14 x̃: 2 helped stats (rel) min: 0.05% max: 1.67% x̄: 0.87% x̃: 0.72% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.06% max: 0.06% x̄: 0.06% x̃: 0.06% 95% mean confidence interval for cycles value: -7.28 -4.29 95% mean confidence interval for cycles %-change: -1.03% -0.63% Cycles are helped. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8852>	2021-02-08 00:25:22 +00:00
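The mask problem described in 6b0443a900 can be modelled in a few lines of Python. This is a minimal sketch under stated assumptions, not the pattern text from the NIR algebraic pass: the helper names (`shift_mask`, `shr_then_shl`, `masked`, `masked_fixed_base`) are hypothetical, values are treated as unsigned integers of a given bit size, and the fixed-base variant is only a simplified stand-in for the earlier 0xffffffff behaviour.

```python
def full_mask(bit_size: int) -> int:
    """All-ones value for an unsigned integer of the given width."""
    return (1 << bit_size) - 1


def shr_then_shl(x: int, bit_size: int, shift: int) -> int:
    """Reference behaviour: (x >> shift) << shift, truncated to bit_size bits."""
    full = full_mask(bit_size)
    return (((x & full) >> shift) << shift) & full


def shift_mask(bit_size: int, shift: int) -> int:
    """Mask that clears the low `shift` bits, sized for bit_size-bit values."""
    full = full_mask(bit_size)
    return (full << shift) & full


def masked(x: int, bit_size: int, shift: int) -> int:
    """Replacement: a single AND with a mask derived from the real bit size."""
    return x & shift_mask(bit_size, shift)


def masked_fixed_base(x: int, bit_size: int, shift: int) -> int:
    """Assumed model of the bug: the base mask is always 0xffffffff."""
    return x & ((0xFFFFFFFF << shift) & full_mask(bit_size))


# The size-aware mask matches the shift pair for every width and shift count.
for bits in (8, 16, 32, 64):
    sample = 0x123456789ABCDEF0 & full_mask(bits)
    for count in range(bits):
        assert masked(sample, bits, count) == shr_then_shl(sample, bits, count)

# With a fixed 32-bit base mask, narrow values still come out right,
# but the upper bits of a 64-bit value are lost.
assert masked_fixed_base(0xABCD, 16, 8) == shr_then_shl(0xABCD, 16, 8)
assert masked_fixed_base(0x123456789ABCDEF0, 64, 16) != shr_then_shl(
    0x123456789ABCDEF0, 64, 16
)
```

Deriving the mask from the operand's bit size is what lets one rule cover arbitrary shift counts, which is consistent with the commit's note that fewer patterns are needed after the fix.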
Alyssa Rosenzweig	083843de1e	nir/lower_io: Fix grammar errors Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8846>	2021-02-04 11:45:26 +00:00
Caio Marcelo de Oliveira Filho	a2414ada87	nir: Add nir_zero_initialize_shared_memory Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8708>	2021-02-02 17:06:56 +00:00
Jason Ekstrand	774fae34f0	nir: Drop the lower_mem_constant_vars declaration The function was removed in `c730ace12b`. Fixes: `c730ace12b` "nir,clover: Drop nir_lower_mem_constant_vars" Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8834>	2021-02-02 16:34:22 +00:00
Jason Ekstrand	f064b7a42c	nir: Add some ssa-only fast-paths for nir_src rewrite Basically every pass in NIR uses nir_ssa_def_rewrite_uses which calls nir_instr_rewrite_src which is fairly complex because it handles all sorts of non-SSA cases. Since we already know a priori that every source written by nir_ssa_def_rewrite_uses is SSA, we can check new_src once at the top of the function and cut out all that complexity. While we're at it, we expose a new SSA-only nir_ssa_def_rewrite_uses_ssa helper which takes an SSA def which avoids the one SSA check. It's also more convenient 90% of the time. Compile time as tested by Rhys Perry <pendingchaos02@gmail.com> Difference at 95.0% confidence -797.166 +/- 418.649 -0.566174% +/- 0.296441% (Student's t, pooled s = 325.459) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8790>	2021-02-02 15:35:55 +00:00
Yevhenii Kolesnikov	a678ec9b8c	nir/from_ssa: don't check for interference within the same set Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8246>	2021-02-01 14:28:35 -06:00
Yevhenii Kolesnikov	fd05620e43	nir/from_ssa: consider defs in sibling blocks If def a and def b are in sibling blocks, the one with higher parent_instr's index does not necessarily come after the other. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3712 Fixes: `943ddb9458` "nir: Add a better out-of-SSA pass" Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8246>	2021-02-01 14:27:56 -06:00

2997 commits