fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 13:48:06 +02:00

Author	SHA1	Message	Date
Gert Wollny	6c9495b392	nir: make nir_get_texture_size/lod available outside nir_lower_tex This functions can be useful in other places. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3286>	2020-01-04 16:22:40 +00:00
Bas Nieuwenhuizen	96c9483ccf	spirv: Fix glsl type assert in spir2nir. Fixes: `624789e370` "compiler/glsl: handle case where we have multiple users for types" Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2020-01-04 15:53:24 +00:00
Erik Faye-Lund	d9ff5f0414	nir/zink: move clip_halfz-lowering to common code Etnaviv also does the same thing, so let's try to avoid repetition here, and use the same for it code as well. Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Paul Cercueil <paul@crapouillou.net>	2020-01-03 22:48:19 +00:00
Kenneth Graunke	19ed12afd1	st/nir: Optionally unify inputs_read/outputs_written when linking. i965 and iris use inputs_read/outputs_written for a shader stage to determine the layout of input and output storage. Adjacent stages must agree on the layout, so adjacent input/output bitfields must match. This patch adds a new nir_shader_compiler_options::unify_interfaces flag which asks the linker to unify the input/output interfaces between adjacent stages. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3249>	2020-01-03 00:41:50 +00:00
Bas Nieuwenhuizen	59c4fb9d72	nir: print non-uniform tex fields. To ease debugging in the future. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3246> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3246>	2020-01-02 11:42:33 +01:00
Bas Nieuwenhuizen	69bdc1c5fc	nir: Add clone/hash/serialize support for non-uniform tex instructions. These were missed when the fields got added. Added it everywhere where texture_index got used and it made sense. Found this in "The Surge 2", where the inliner does not copy the fields, resulting in corruption and hangs. Fixes: `3bd5457641` "nir: Add a lowering pass for non-uniform resource access" Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1203 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3246>	2020-01-02 11:41:33 +01:00
Alyssa Rosenzweig	ddc5a371b3	glsl: Set .flat for gl_FrontFacing It is a boolean. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3237> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3237>	2019-12-30 19:54:50 +00:00
Mauro Rossi	200be80858	android: nir: add a load/store vectorization pass Fixes the following aco building error: external/mesa/src/amd/compiler/aco_instruction_selection_setup.cpp:846: error: undefined reference to 'nir_opt_load_store_vectorize' Fixes: `ce9205c` ("nir: add a load/store vectorization pass") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-27 09:20:35 +01:00
Dave Airlie	41c77dbc1e	nir: sanitize work group intrinsics to always be 32-bit. This saves handling them in the backend later. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-12-27 13:22:34 +10:00
Rob Clark	a8ec4082a4	nir+vtn: vec8+vec16 support This introduces new vec8 and vec16 instructions (which are the only instructions taking more than 4 sources), in order to construct 8 and 16 component vectors. In order to avoid fixing up the non-autogenerated nir_build_alu() sites and making them pass 16 src args for the benefit of the two instructions that take more than 4 srcs (ie vec8 and vec16), nir_build_alu() is has nir_build_alu_tail() split out and re-used by nir_build_alu2() (which is used for the > 4 src args case). v2 (Karol Herbst): use nir_build_alu2 for vec8 and vec16 use python's array multiplication syntax add nir_op_vec helper simplify nir_vec nir_build_alu_tail -> nir_builder_alu_instr_finish_and_insert use nir_build_alu for opcodes with <= 4 sources v3 (Karol Herbst): fix nir_serialize v4 (Dave Airlie): fix serialization of glsl_type handle vec8/16 in lowering of bools v5 (Karol Herbst): fix load store vectorizer Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-12-21 11:00:17 +00:00
Karol Herbst	c83b1a4560	nir/serialize: cast swizzle before shifting fixes undefined behaviour with enabled vec16 Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-12-21 11:00:16 +00:00
Caio Marcelo de Oliveira Filho	4fbc99c124	spirv: Implement SPV_KHR_non_semantic_info Do nothing for OpExtInst from extended instruction sets that name start with "NonSemantic.". Since they can be used within the "preamble" to annotate global decorations, also don't stop iterating when one of them is found. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3154> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3154>	2019-12-19 22:49:39 -08:00
Jonathan Marek	06ae0674fd	nir: fix assign_io_var_locations for vertex inputs Also fixes fragment inputs using the wrong "base" value (which was working only because FRAG_RESULT_DATA0 is less than VARYING_SLOT_VAR0) Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3108> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3108>	2019-12-19 21:26:52 +00:00
Juan A. Suarez Romero	7f821289cb	Revert "nir/lower_double_ops: relax lower mod()" This reverts commit `8172b1fa03`. This commit was done taking in account Vulkan spec, but did not realize it was affecting OpenGL too. Closes: #2252	2019-12-19 20:01:16 +01:00
Juan A. Suarez Romero	8172b1fa03	nir/lower_double_ops: relax lower mod() Currently when lowering mod() we add an extra instruction so if mod(a,b) == b then 0 is returned instead of b, as mathematically mod(a,b) is in the interval [0, b). But Vulkan spec has relaxed this restriction, and allows the result to be in the interval [0, b]. This commit takes this in account to remove the extra instruction required to return 0 instead. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2922> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2922>	2019-12-19 12:36:30 +00:00
Jonathan Marek	004797002f	nir: add option to lower half packing opcodes Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3106>	2019-12-16 19:20:07 -05:00
Iago Toral Quiroga	6c7a2b69f8	v3d: handle writes to gl_Layer from geometry shaders When geometry shaders write a value to gl_Layer that doesn't correspond to an existing layer in the target framebuffer the rendering behavior is undefined according to the spec, however, there are CTS tests that trigger this scenario on purpose, probably to ensure that nothing terrible happens. For V3D, this situation is problematic because the binner uses the layer index to select the offset to write into the tile state data, and we only allocate tile state for MAX2(num_layers, 1), so we want to make sure we don't produce values that would lead to out of bounds writes. The simulator has an assert to catch this, although we haven't observed issues in actual hardware it is probably best to play safe. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Alejandro Piñeiro	2865d79a33	nir/opt_peephole_select: remove unused variables To avoid "unused variable" warnings. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-12-13 17:14:58 +01:00
Timothy Arceri	a6aedc662e	st/glsl_to_nir: use nir based program resource list builder Here we use the NIR based builder to add everything to the resource list execpt for SSO packed varyings. Since the details of those varyings get lost during packing we leave the special handing to the GLSL IR pass for now. In order to do this we add some bools to the build resource list functions. Using the NIR based resource list builder gets us a step closer to using a native NIR based linker. It should also be faster than the GLSL IR builder, one because the NIR optimisations should mean we add less entries due to better optimisations, and two because nir gives us better lists to work with and we don't need to walk the entire IR to find the resources. Ack-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-13 00:07:19 +00:00
Timothy Arceri	d0259f4159	glsl: add subroutine support to nir_build_program_resource_list() This is required so we can use the NIR linker to link GLSL in addition to spirv. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-13 00:07:19 +00:00
Timothy Arceri	46f9f74c57	glsl: add support for named varyings in nir_build_program_resource_list() This adds support for adding names of varying to the resource list which is required for us to use this function with the glsl linker. Support for names is optional for spirv which is why it had not been added yet. This is mostly a copy of the GLSL IR code adapted to nir. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-13 00:07:19 +00:00
Timothy Arceri	3c364f90fd	glsl: copy the new data fields when converting to nir These fields added in the previous commit will be used to make use of a NIR based GLSL linker. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-13 00:07:19 +00:00
Timothy Arceri	56c25b938c	nir: add some fields to nir_variable_data These will be used to provide NIR linking functionality to GLSL. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-13 00:07:19 +00:00
Timothy Arceri	89b2b0f767	glsl: copy the how_declared field when converting to nir This is needed to make use of nir_build_program_resource_list(). Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-13 00:07:19 +00:00
Timothy Arceri	c3823d2d29	glsl: move nir_remap_dual_slot_attributes() call out of glsl_to_nir() In order to be able to implement a NIR based glsl linker we need to build the program resource list with NIR. This change delays the remaping so that a later commit can call the NIR based resource list builder. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-13 00:07:19 +00:00
Tomeu Vizoso	fb579b0347	nir: Don't copy empty array It's undefined behavior UBSAN complains about, so fixing this will reduce the noise a bit. ../src/compiler/nir/nir_clone.c:710:4: runtime error: null pointer passed as argument 2, which is declared to never be null"} #0 0xac781be4 in clone_function ../src/compiler/nir/nir_clone.c:710"} #1 0xac781be4 in nir_shader_clone ../src/compiler/nir/nir_clone.c:740"} #2 0xacf99442 in panfrost_shader_compile ../src/gallium/drivers/panfrost/pan_assemble.c:54"} #3 0xacf6b268 in panfrost_bind_shader_state ../src/gallium/drivers/panfrost/pan_context.c:1960"} #4 0xaae326bc in set_fragment_shader ../src/mesa/state_tracker/st_cb_clear.c:135"} #5 0xaae326bc in clear_with_quad ../src/mesa/state_tracker/st_cb_clear.c:335"} #6 0xaae326bc in st_Clear ../src/mesa/state_tracker/st_cb_clear.c:518"} #7 0x494d0e in deqp::gles2::TestCaseWrapper::iterate(tcu::TestCase) (/deqp/modules/gles2/deqp-gles2+0x2ad0e)"} #8 0x7f9cf2 in tcu::TestSessionExecutor::iterateTestCase(tcu::TestCase) (/deqp/modules/gles2/deqp-gles2+0x38fcf2)"} #9 0x7fa5f0 in tcu::TestSessionExecutor::iterate() (/deqp/modules/gles2/deqp-gles2+0x3905f0)"} #10 0x7e1aac in tcu::App::iterate() (/deqp/modules/gles2/deqp-gles2+0x377aac)"} #11 0x492d4c in main (/deqp/modules/gles2/deqp-gles2+0x28d4c)"} #12 0xb64b9aa8 in __libc_start_main (/lib/arm-linux-gnueabihf/libc.so.6+0x1aaa8)"} Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-12 16:26:45 +01:00
Dave Airlie	790bc9a17e	vtn/opencl: add shuffle/shuffle support This adds nir encoding for these, generating them from libclc was very expensive, and this is a lot simpler. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-12-12 19:40:58 +10:00
Dave Airlie	5471ef7532	vtn: convert vload/store to single value loops There is an alignment issue doing this the other way, the spec clearly says vload/store don't require alignment. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-12-12 19:40:15 +10:00
Karol Herbst	70c6bff2f0	nir: handle nir_deref_type_ptr_as_array in rematerialize_deref_in_block I forgot why that was required, but it still is the correct thing to do. Hit it at some point when working on implementing more CL features. Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-12-11 23:54:39 +00:00
Rob Clark	ddb9701a3c	spirv: add OpLifetime* These are just hints so we can ignore them. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-12-11 23:54:39 +00:00
Karol Herbst	2402232c90	spirv: handle UniformConstant for OpenCL kernels The caller is responsible for setting up the ubo_addr_format value as contrary to shared and global, it's not controlled by the spirv. Right now clovers implementation of CL constant memory uses a 24/8 bit format to encode the buffer index and offset, but that code is dead as all backends treat constants as global memory to workaround annoying issues within OpenCL. Maybe that will change, maybe not. But just in case somebody wants to look at it, add a toggle for this inside vtn. Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-12-11 23:54:39 +00:00
Karol Herbst	0bafde717d	nir/tests: MSVC build fix Fixes: `11f736a6f9` "nir/tests: add serializer tests" Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-12-11 17:12:48 +00:00
Karol Herbst	11f736a6f9	nir/tests: add serializer tests Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-12-11 13:00:44 +01:00
Karol Herbst	676232d76f	nir/serialize: fix vec8 and vec16 Nir serializes uses nir_ssa_alu_instr_src_components in a few places to determine how many components a src has, but that's not what this function returns. It simply returns how many channels are used, which is still fine for most of the code. This was breaking code like this: vec16 32 ssa_1 = intrinsic load_global vec1 32 ssa_2 = fmax ssa_1.a, ssa_2.b v2: make the 16bit encoding work for identify swizzles again Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-12-11 13:00:44 +01:00
Pierre Moreau	8ccd3f48a0	compiler/spirv: Fix uses of gnu struct = {} extension Fixes: `a24d6fbae6` ("meson: Add -Werror=gnu-empty-initializer to MSVC compat args") Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Tested-by: Vinson Lee <vlee@freedesktop.org> Signed-off-by: Pierre Moreau <dev@pmoreau.org>	2019-12-11 06:03:22 +00:00
Timothy Arceri	1abca2b3c8	glsl/nir: iterate the system values list when adding varyings Iterate the system values list when adding varyings to the program resource list in the NIR linker. This is needed to avoid CTS regressions when using the NIR to build the GLSL resource list in an upcoming series. Presumably it also fixes a bug with the current ARB_gl_spirv support. Fixes: `ffdb44d3a0` ("nir/linker: Add inputs/outputs to the program resource list") Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-05 22:04:31 +00:00
Michel Dänzer	f6a913bb95	glsl/tests: Use splitlines() instead of strip() strip() removes leading and trailing newlines, but leaves newlines between multiple lines in the string. This could cause failures when comparing the output of cross-compiled Windows binaries (producing Windows-style newlines) to the expected output with Unix-style newlines. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-12-05 12:31:17 +01:00
Timothy Arceri	1b1b436fa7	glsl: make use of active_shader_mask when building resource list This allows us to avoid walking the entire IR looking for used uniforms. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-12-05 13:18:30 +11:00
Timothy Arceri	f0cb0fe1c0	glsl: don't set uniform block as used when its not The spec requires unused uniform block to be set as active in the program resource list. To support this we tell opt dead code not to remove them. However we can mark them as unused internally and avoid unnecessarily state changes. This change is also required for the folowing clean-up patch. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-12-05 13:18:23 +11:00
Timothy Arceri	50dc4b77f6	glsl: move calculate_array_size_and_stride() to link_uniforms.cpp This is where all the other uniform values are populated so it makes much more sense here. Moving it will also allow us to better share code between the NIR and GLSL IR resource list builders. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-12-05 13:18:02 +11:00
Rob Clark	372ed42d22	nir/lower_clip: Fix incorrect driver loc for clipdist outputs Somehow adjusting maxloc based on existing outputs got lost, resulting in the clipdist varying clobbering the position varying. Causing a shader that had no position output in freedreno/ir3, which triggers GPU hangs in neverball. Fixes: `d0f746b645` ("nir: Save nir_variable pointers in nir_lower_clip_vs rather than locs.") Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-04 13:08:52 -08:00
Tapani Pälli	272ef5d39a	glsl: additional interface redeclaration check for SSO programs Patch adds additional linker check for SSO programs to make sure they are redeclaring built-in blocks as required by the desktop spec. This fixes following Piglit tests: arb_separate_shader_objects/linker/pervertex-* Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-12-04 15:27:41 +00:00
Rhys Perry	3e67aa2e4e	nir/load_store_vectorize: fix combining stores with aliasing loads between v2: add test Fixes: `ce9205c03b` ('nir: add a load/store vectorization pass') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> (v1) Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v2)	2019-12-04 12:21:40 +00:00
Ian Romanick	fbd5359a0a	nir/algebraic: Rearrange bcsel sequences generated by nir_opt_peephole_select Reviewed-by: Matt Turner <mattst88@gmail.com> All Intel platforms had similar results. (Ice Lake shown) total instructions in shared programs: 14660366 -> 14653437 (-0.05%) instructions in affected programs: 316166 -> 309237 (-2.19%) helped: 905 HURT: 10 helped stats (abs) min: 1 max: 36 x̄: 7.67 x̃: 6 helped stats (rel) min: 0.13% max: 18.75% x̄: 4.28% x̃: 3.60% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.10% max: 1.33% x̄: 0.70% x̃: 0.97% 95% mean confidence interval for instructions value: -7.91 -7.23 95% mean confidence interval for instructions %-change: -4.46% -3.99% Instructions are helped. total cycles in shared programs: 228571646 -> 228549759 (<.01%) cycles in affected programs: 56239919 -> 56218032 (-0.04%) helped: 681 HURT: 216 helped stats (abs) min: 1 max: 5156 x̄: 45.49 x̃: 10 helped stats (rel) min: <.01% max: 10.45% x̄: 1.29% x̃: 0.65% HURT stats (abs) min: 1 max: 320 x̄: 42.09 x̃: 14 HURT stats (rel) min: <.01% max: 37.04% x̄: 1.38% x̃: 0.49% 95% mean confidence interval for cycles value: -41.51 -7.29 95% mean confidence interval for cycles %-change: -0.80% -0.49% Cycles are helped. LOST: 1 GAINED: 0	2019-12-02 16:46:20 -08:00
Ian Romanick	780b5c1037	nir/algebraic: Simplify some Inf and NaN avoidance code Since a is non-negative, neither fsqrt nor frsq should return NaN. frsq should only return Inf when fsqrt returns 0. The changes are pretty small, but this turns a few hundred hurt shaders in the next patch into helped shaders. An alternative to the intBitsToFloat is to import numpy and do np.finfo(np.float32).max. That's more explicit, but we may also want to have specific bit encodings of float values later. I could be convinced either way, but intBitsToFloat(0x7f7fffff) was what I implemented first. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 14661140 -> 14661104 (<.01%) instructions in affected programs: 7520 -> 7484 (-0.48%) helped: 36 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.32% max: 0.61% x̄: 0.49% x̃: 0.52% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.52% -0.47% Instructions are helped. total cycles in shared programs: 228585416 -> 228584806 (<.01%) cycles in affected programs: 56321 -> 55711 (-1.08%) helped: 32 HURT: 0 helped stats (abs) min: 2 max: 98 x̄: 19.06 x̃: 10 helped stats (rel) min: 0.08% max: 6.41% x̄: 1.09% x̃: 0.65% 95% mean confidence interval for cycles value: -28.32 -9.80 95% mean confidence interval for cycles %-change: -1.63% -0.54% Cycles are helped. Sandy Bridge total cycles in shared programs: 152991077 -> 152991075 (<.01%) cycles in affected programs: 11525 -> 11523 (-0.02%) helped: 2 HURT: 2 helped stats (abs) min: 2 max: 4 x̄: 3.00 x̃: 3 helped stats (rel) min: 0.07% max: 0.11% x̄: 0.09% x̃: 0.09% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.08% max: 0.08% x̄: 0.08% x̃: 0.08% 95% mean confidence interval for cycles value: -5.27 4.27 95% mean confidence interval for cycles %-change: -0.16% 0.15% Inconclusive result (value mean confidence interval includes 0). No changes on Iron Lake or GM45.	2019-12-02 16:46:20 -08:00
Ian Romanick	e342d6970b	nir/opt_peephole_select: Don't count some unary operations In many cases, fsat, fneg, fabs, ineg, and iabs will get folded into another instruction as either source or destination modifiers. Counting them as instructions means that some if-statements won't get converted to selects. For example, vec1 32 ssa_25 = flt32 ssa_0, ssa_23.x /* succs: block_1 block_2 / if ssa_25 { block block_1: / preds: block_0 / vec1 32 ssa_26 = fabs ssa_24 vec1 32 ssa_27 = fneg ssa_26 vec1 32 ssa_28 = fabs ssa_20 vec1 32 ssa_29 = fneg ssa_28 vec1 32 ssa_30 = fmul ssa_27, ssa_29 vec1 32 ssa_31 = fsat ssa_30 / succs: block_3 / } else { block block_2: / preds: block_0 / / succs: block_3 / } block block_3: / preds: block_1 block_2 */ block_1 isn't really 6 instructions, but it will be counted that way. Most callers of the peephole_select pass use either 1 or 8. It's very easy to blow way past either of these limits with things that are really only one or two actual instructions. I also tried some fancier things like making sure the fsat was of another SSA def from the same block, but the simple test was actually better. The i965 back-end SEL peephole pass still helps ~700 shaders in shader-db with this change. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> All Gen6+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 14743694 -> 14738910 (-0.03%) instructions in affected programs: 156575 -> 151791 (-3.06%) helped: 1204 HURT: 0 helped stats (abs) min: 1 max: 27 x̄: 3.97 x̃: 3 helped stats (rel) min: 0.15% max: 19.57% x̄: 5.15% x̃: 4.55% 95% mean confidence interval for instructions value: -4.12 -3.82 95% mean confidence interval for instructions %-change: -5.35% -4.95% Instructions are helped. total cycles in shared programs: 231749141 -> 231602916 (-0.06%) cycles in affected programs: 2818975 -> 2672750 (-5.19%) helped: 876 HURT: 322 helped stats (abs) min: 2 max: 788 x̄: 180.99 x̃: 220 helped stats (rel) min: <.01% max: 43.82% x̄: 20.75% x̃: 19.44% HURT stats (abs) min: 1 max: 1188 x̄: 38.27 x̃: 20 HURT stats (rel) min: 0.09% max: 102.67% x̄: 5.17% x̃: 1.70% 95% mean confidence interval for cycles value: -130.47 -113.64 95% mean confidence interval for cycles %-change: -14.85% -12.72% Cycles are helped. total sends in shared programs: 730495 -> 730491 (<.01%) sends in affected programs: 46 -> 42 (-8.70%) helped: 2 HURT: 0 Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8122757 -> 8122617 (<.01%) instructions in affected programs: 14716 -> 14576 (-0.95%) helped: 46 HURT: 1 helped stats (abs) min: 1 max: 8 x̄: 3.07 x̃: 3 helped stats (rel) min: 0.36% max: 10.00% x̄: 2.54% x̃: 1.06% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 1.59% max: 1.59% x̄: 1.59% x̃: 1.59% 95% mean confidence interval for instructions value: -3.42 -2.54 95% mean confidence interval for instructions %-change: -3.28% -1.62% Instructions are helped. total cycles in shared programs: 188510100 -> 188509780 (<.01%) cycles in affected programs: 58994 -> 58674 (-0.54%) helped: 32 HURT: 1 helped stats (abs) min: 2 max: 96 x̄: 10.06 x̃: 6 helped stats (rel) min: 0.05% max: 15.29% x̄: 1.37% x̃: 0.31% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.68% max: 0.68% x̄: 0.68% x̃: 0.68% 95% mean confidence interval for cycles value: -16.34 -3.06 95% mean confidence interval for cycles %-change: -2.46% -0.15% Cycles are helped.	2019-12-02 16:46:19 -08:00
Rhys Perry	5404b7aaa3	nir/lower_io_to_vector: don't create arrays when not needed Some backends require that there are no array varyings. If there were no arrays in the input shader, the pass shouldn't have to create new ones. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2103 Fixes: `bcd14756ee` ('nir/lower_io_to_vector: add flat mode') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-12-02 17:45:01 +00:00
Dave Airlie	3e21e17b2f	nir/samplers: don't zero samplers_used/txf. This allows this pass to be run multiple times and the results are just or'ed together. It fixes on test on llvmpipe nir, and regresses none. Suggested by Kenneth Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-12-02 09:15:55 +10:00
Tapani Pälli	d61a21f439	glsl: handle max uniform limits with lower_const_arrays_to_uniforms Fixes arb_tessellation_shader-large-uniforms Piglit test. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-11-28 14:11:46 +02:00
Kenneth Graunke	9b577f2a88	driconf, glsl: Add a vs_position_always_invariant option Many applications use multi-pass rendering and require their vertex shader position to be computed the same way each time. Optimizations may consider, say, fusing a multiply-add based on global usage of an expression in a shader. But a second shader with the same expression may have different code, causing that optimization to make the other choice the second time around. The correct solution is for applications to mark their VS outputs 'invariant', indicating they need multiple shaders to compute that output in the same manner. However, most applications fail to do so. So, we add a new driconf option - vs_position_always_invariant - which forces the gl_Position output in vertex shaders to be marked invariant. Fixes: `7025dbe794` ("nir: Skip emitting no-op movs from the builder.") Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-11-27 18:48:04 +00:00

1 2 3 4 5 ...

4387 commits