fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-22 11:20:11 +01:00

Author	SHA1	Message	Date
Jason Ekstrand	ff2f44d865	intel/fs: Implement nir_intrinsic_load_global_constant Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6379>	2020-09-01 20:50:04 +00:00
Jason Ekstrand	cccb497d3c	intel/fs: Fix MOV_INDIRECT and BROADCAST of Q types on Gen11+ The immediate case is pretty uncommon to see but it can happen, in theory. BROADCAST is typically used to uniformize values and those are usually 32-bit. However, it does come up in some subgroup ops. Fixes: `49c21802cb` "intel/compiler: Split has_64bit_types into float/int" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6211>	2020-09-01 13:25:20 -05:00
Karol Herbst	70cbddc4a7	nir: use enum operator helper for nir_variable_mode and nir_metadata those are used quite a bit Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6520>	2020-09-01 17:45:08 +00:00
Jason Ekstrand	4d18e71fea	nir: Rename num_shared to shared_size This one is always a size in bytes. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6524>	2020-09-01 17:30:51 +00:00
Jason Ekstrand	e8b3bc1d55	intel/nir: Lower things with > 4 components in lower_mem_access_bit_sizes Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6502>	2020-08-31 17:04:40 +00:00
Jason Ekstrand	55ae704513	intel/fs: Add support for vec8 and vec16 ops Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6502>	2020-08-31 17:04:40 +00:00
Jason Ekstrand	b7db9ee320	intel/nir: Clean up lower_alpha_to_coverage a bit Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6233>	2020-08-29 16:41:05 +00:00
Jason Ekstrand	b6fdb1405e	intel/nir: Rewrite the guts of lower_alpha_to_coverage I have no idea how this pass ever worked. I guess it worked ok on the one or two piglit tests but the whole thing seemed very fragile. It makes a number of undocumented and unasserted assumptions and they aren't always valid. This rewrite makes a number of changes: 1. It now properly handles the case where the gl_SampleMask write comes before the gl_FragColor or gl_FragData[0] write. 2. It should early-exit faster because it now looks at bits in shader_info::outputs_written instead of looking for variables. 3. Instead of the fragile variable lookup where we try to look the variable up by both location and driver_location and match, we just use the driver_location calculations used by brw_fs_nir. 4. It asserts that the index parameter to store_output is a constant instead of silently failing if it isn't. 5. We now actually assert the implicit assumption that the two writes are in the same block. We go even further and assert that they are in the last block in the shader. 6. In the case where 3 or fewer components of the output are written, we explicitly choose to leave the sample mask alone. Fixes: `7ecfbd4f6d` "nir: Add alpha_to_coverage lowering pass" Closes: #3166 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6233>	2020-08-29 16:41:05 +00:00
Jason Ekstrand	72dc06e07e	intel/nir: Pass the nir_builder by reference in lower_alpha_to_coverage I'm honestly not sure how passing a builder by-value ever worked. I guess the struct is mostly copyable. In any case, that's the wrong way to use it and it's causing issues. Fixes: `7ecfbd4f6d` "nir: Add alpha_to_coverage lowering pass" Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6233>	2020-08-29 16:41:05 +00:00
Jason Ekstrand	c84e2784eb	intel/nir: Allow splitting a single load into up to 32 loads Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6405>	2020-08-21 22:49:54 +00:00
Jason Ekstrand	febe762246	intel/fs: Fix an assert in load_scratch Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6405>	2020-08-21 22:49:54 +00:00
Jesse Natalie	d3faac7a15	nir: Add options to nir_lower_compute_system_values to control compute ID base lowering If no options are provided, existing intrinsics are used. If the lowering pass indicates there should be offsets used for global invocation ID or work group ID, then those instructions are lowered to include the offset. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5891>	2020-08-21 22:07:05 +00:00
Jesse Natalie	2e1df6a17f	nir: Move compute system value lowering to a separate pass The actual variable -> intrinsic lowering stays where it is, but ops which convert one intrinsic to be implemented in terms of another have moved. Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5891>	2020-08-21 22:07:05 +00:00
Karol Herbst	e5899c1e88	nir: rename nir_op_fne to nir_op_fneu It was always fneu but naming it fne causes confusion from time to time. So lets rename it. Later we also want to add other unordered and fne, this is a smaller preparation for that. Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6377>	2020-08-21 17:26:21 +00:00
Jason Ekstrand	1ccd681109	nir: Add an LOD parameter to image_*_size The OpenCL image_width/height/depth functions have variants which can take an LOD parameter. More importantly, LLVM-SPIRV-Translator always generates OpImageQuerySizeLod even if the LOD is guaranteed to be zero. Given that over half the hardware out there has an LOD field for image size queries (based on a rudimentary scan through their NIR -> whatever code), we may as well just add the source to the NIR intrinsic. If this is ever a problem for anyone, the lowering is pretty trivial. I've also added asserts to everyone's drivers that should alert them if they ever see an LOD other than zero. This will never happen with GL or Vulkan so there's no need for panic. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6396>	2020-08-20 20:48:10 +00:00
Louis-Francis Ratté-Boulianne	7dcb1d272f	st/mesa: Replace UsesStreams by ActiveStreamMask for GS Some drivers need to know which streams are used by a geometry shader. Adding a mask of active streams makes the use of UsesStreams superfluous as it's the equivalent of: ActiveStreamMask != (1 << 0) Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5984>	2020-08-18 11:17:26 +00:00
Jason Ekstrand	003b04e266	intel/compiler: Allow MESA_SHADER_KERNEL Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6280>	2020-08-12 10:11:06 +00:00
Caio Marcelo de Oliveira Filho	e2b6ccbdad	intel/compiler: Use C99 array initializers for prog_data/key sizes This is way better than a pile of STATIC_ASSERTs. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6280>	2020-08-12 10:11:06 +00:00
Jason Ekstrand	8e1de8e5ac	intel/cs_intrinsics: Handle 64-bit intrinsics It's safe to do the math in 32 bits because they're all local workgroup calculations. We just need to do a conversion at the end. For a couple of intrinsics, we just turn them into 32-bit intrinsics and add a u2u64. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6280>	2020-08-12 10:11:06 +00:00
Mark Janes	cf52b40fb0	intel/fs: work around gen12 lower-precision source modifier limitation GEN:BUG:1604601757 prevents source modifiers for multiplication of lower precision integers. lower_mul_dword_inst() splits 32x32 multiplication into 32x16, and needs to eliminate source modifiers in this case. Closes: #3329 Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2020-08-10 13:30:45 -07:00
Mark Janes	ee06e47c5b	intel/fs: Assert if lower_source_modifiers converts 32x16 to 32x32 multiplication Lowering source modifiers will convert a 16bit source to a 32bit value. In the case of integer multiplication, this will reverse previous lowering performed by lower_mul_dword_inst. Assert to prevent an illegal DxD operation (and GPU hang). Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2020-08-10 13:29:56 -07:00
Eric Anholt	023e6669cc	i965: Enable vector shrinking in the vec4 backend. This manages to make some extra vec operations that would turn into movs go away. brw shader-db: total instructions in shared programs: 3895037 -> 3893221 (-0.05%) total cycles in shared programs: 113832759 -> 113792154 (-0.04%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6050>	2020-08-03 21:26:45 +00:00
Matt Turner	c883c482be	intel/compiler: Relax SENDS regioning assertions The next commit fixes a mistake in the assembler and ends up running afoul of this assertion. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5956>	2020-07-31 12:59:24 -07:00
David Stevens	6c11a7994d	i965/i915: Add colorspace support to YUV sampling Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6122>	2020-07-31 07:27:03 +00:00
Boris Brezillon	025988f818	intel: Set int64_options to ~0 when lowering 64b ops That's more future proof than setting each bit manually. Looks like we already miss nir_lower_ufind_msb64 because of that. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5588>	2020-07-30 16:54:24 +00:00
Boris Brezillon	bfee35b45c	nir: Stop passing an options arg to nir_lower_int64() This information is exposed through shader->options->lower_int64_options. Removing the extra arg forces drivers to initialize this field correctly. This also allows us to check the int64 lowering options from each int64 lowering helper and decide if we should lower the instructions we introduce. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5588>	2020-07-30 16:54:24 +00:00
Marcin Ślusarz	cb19fe24d3	intel/vec4: fix out of bounds read NIR_MAX_VEC_COMPONENTS was bumped from 4 to 16 in `a8ec4082` (2019.03.09, merged 2019.12.21) float[4] array was added in `acd7796a` (2019.06.11, merged 2019.07.11) Found by Coverity. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3014 Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Fixes: `a8ec4082a4` ("nir+vtn: vec8+vec16 support") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6067>	2020-07-30 10:41:00 +00:00
Jason Ekstrand	5c5555a862	nir: Add a find_variable_with_[driver_]location helper We've hand-rolled this loop 10 places and those are just the ones I found easily. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5966>	2020-07-29 17:38:58 +00:00
Jason Ekstrand	2956d53400	nir: Add nir_foreach_shader_in/out_variable helpers Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5966>	2020-07-29 17:38:57 +00:00
Francisco Jerez	4d73988f6f	intel/ir/gen12+: Work around FS performance regressions due to SIMD32 discard divergence. This avoids some performance regressions on Gen12 platforms caused by SIMD32 fragment shaders reported in titles like Dota2, TF2, Xonotic, and GFXBench5 Car Chase and Aztec Ruins. The most obvious pattern in the regressing shaders I identified among these workloads is that they all had non-uniform discard statements, which are handled rather optimistically by the current IR analysis pass: No penalty is currently applied to the SIMD32 variant of the shader in the form of differing branching weights like we do for other control flow instructions in order to account for the greater likelihood of divergence of a SIMD32 shader. Simply changing that by giving the same treatment to discard statements as we give to other branching instructions seemed to hurt more than it helped on platforms earlier than Gen12, since it reversed most of the improvement obtained from SIMD32 fragment shaders in Manhattan for no measurable benefit in other workloads (Manhattan has a handful of shaders with statically non-uniform discard statements which actually perform better in SIMD32 mode due to their approximate dynamic uniformity). For that reason this change is applied to Gen12+ platforms only. I've been running a number of tests trying to understand the difference in behavior between Gen12 and earlier platforms, and most of the evidence I've gathered seems to point at EU fusion being the culprit: Unlike previous generations, on Gen12 EUs are arranged in pairs which execute instructions in lockstep, giving an effective warp size of 64 threads in SIMD32 mode, which seems to increase the likelihood for control flow divergence in some of the affected shaders significantly. Fixes: `188a3659ae` "intel/ir: Import shader performance analysis pass." Reported-by: Caleb Callaway <caleb.callaway@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5910>	2020-07-23 01:40:06 +00:00
Jason Ekstrand	675d7b19a9	intel/fs: Use the correct logical op for global float atomics Fixes: `e644ed468f` "intel/fs: Implement nir_intrinsic_global_atomic_*" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5992>	2020-07-21 05:01:34 +00:00
Yevhenii Kolesnikov	36abb0c691	intel/compiler: don't propagate cmp to add if add is saturated From the Kaby Lake PRM Vol. 7 "Assigning Conditional Flags": * Note that the [post condition signal] bits generated at the output of a compute are before the .sat. Paragraph about post_zero does not mention saturation, but testing it on actual GPUs shows that conditional modifiers are applied after saturation. * post_zero bit: This bit reflects whether the final result is zero after all the clamping, normalizing, or format conversion logic. For signed types we don't care about saturation: it won't change the result of conditional modifier. For floating and unsigned types there are two special cases, when we can remove inst even if scan_inst is saturated: G and LE. Since conditional modifiers are just comparisons against zero, saturating positive values to the upper limit never changes the result of comparison. For negative values: (sat(x) > 0) == (x > 0) --- false (sat(x) <= 0) == (x <= 0) --- true Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2610 Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4167>	2020-07-11 00:25:48 +00:00
Jordan Justen	8dfa072ed8	intel/compiler/fs: Still attempt simd32 when INTEL_DEBUG=no16 is used If INTEL_DEBUG=no16 is used, then simd16 will not be attempted. This, in turn prevents simd32 from running, because we attempt to skip simd32 when simd16 fails to compile. This change more accurately recognizes when we attempted simd16, but simd16 failed. One easy way to cause an issue is to set both no8 and no16. Before this change, we would be left with no FS program, even though simd32 could still be generated in some cases. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5269>	2020-07-09 15:44:57 -07:00
Jordan Justen	1a4a2f563b	intel/compiler/cs: Allow simd32 in some more cases with no8 and/or no16 If no16 was specified, and the shader can't run in simd8 due to the local_size, then we need to generate a simd32 program. If both no8 and no16 are specified, then we need to generate a simd32 program. Rework: * Drop update of `if` that would have changed `do32` to try simd32 even if simd16 spilled registers. (Caio) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5269>	2020-07-09 15:44:34 -07:00
Timothy Arceri	1a8f918050	intel/compiler: add and fix up fallthrough comments for gcc warnings Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5714>	2020-07-02 12:11:30 +10:00
Matt Turner	8da810a7fb	intel/compiler: Don't emit no-op cr0 changes If mask is 0, we're asking for no changes to cr0. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5566>	2020-07-02 01:24:06 +00:00
Matt Turner	fe14dc98bf	intel/compiler: Add assert that set bits are within mask We generate bitfields of bits that we want to retain (mask) and bits that we want to set (brw_mode) in the cr0 register, so the bits we want to set should be in the set of bits we want to retain. Also, remove the initialization of mask from fs_visitor::emit_shader_float_controls_execution_mode since brw_rnd_mode_from_nir initializes the mask parameter unconditionally. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5566>	2020-07-02 01:24:06 +00:00
Jason Ekstrand	561aaeeb48	intel/eu: Add the RNDU opcode We don't want to use it on gen5 and earlier because only RNDD can be done with a single instruction and we can implement RNDU(x) as -RNDD(-x) so it's better to just do that when we have the instruction. On gen6 and above, we may as well just use the right instruction. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5596>	2020-06-23 17:43:54 +00:00
Jason Ekstrand	e0ab48e3ea	intel/eu: Set the right subnr for ALIGN16 destinations Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5596>	2020-06-23 17:43:54 +00:00
Jason Ekstrand	8a0d772dca	intel/eu: Add a brw_urb_dest_msg_type helper Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5596>	2020-06-23 17:43:54 +00:00
Kenneth Graunke	2c762955d4	intel/eu: Add a brw_urb_desc helper Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5596>	2020-06-23 17:43:53 +00:00
Jason Ekstrand	ecda98fbb2	intel/compiler: Expose brw_texture_offset to C Some day we probably want to move it out of brw_shader if we're going to share it with IBC but that can be another day. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5596>	2020-06-23 17:43:53 +00:00
Jason Ekstrand	479797e130	intel/fs: Move more prog_data setup into populate_wm_prog_data Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5596>	2020-06-23 17:43:53 +00:00
Jason Ekstrand	fc519cad57	intel/fs: Break wm_prog_data setup into a helper Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5596>	2020-06-23 17:43:53 +00:00
Jason Ekstrand	2687ec5ee6	intel/fs: Expose a couple of NIR lowering helpers Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5596>	2020-06-23 17:43:53 +00:00
Arcady Goldmints-Orlov	04f77595f0	intel/compiler: Always apply sample mask on Vulkan. With OpenGL, shader writes to the sample mask are ignored when not rendering to a multisample render target. However, on Vulkan, writes to the sample mask still have their effect in that case. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3016 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5156>	2020-06-19 20:24:11 -05:00
Sagar Ghuge	a0ef4971d0	intel/compiler: Remove unnecessary optimization for MUL 2 source instruction only support immediate for src1 operand, so no point in adding optimization condition for src0 operand. v2: - Update commit message and don't remove ADD optimization (Matt Turner) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5341>	2020-06-16 17:11:32 -07:00
Sagar Ghuge	d4f3f9390f	intel/compiler: Optimize integer add with 0 into mov Kaby Lake total instructions in shared programs: 326560 -> 323616 (-0.90%) instructions in affected programs: 178062 -> 175118 (-1.65%) helped: 129 HURT: 0 helped stats (abs) min: 1 max: 118 x̄: 22.82 x̃: 8 helped stats (rel) min: 0.35% max: 6.56% x̄: 2.57% x̃: 2.47% 95% mean confidence interval for instructions value: -27.71 -17.93 95% mean confidence interval for instructions %-change: -2.81% -2.32% Instructions are helped. total cycles in shared programs: 43741127 -> 45397851 (3.79%) cycles in affected programs: 40880261 -> 42536985 (4.05%) helped: 94 HURT: 34 helped stats (abs) min: 5 max: 6160 x̄: 598.91 x̃: 45 helped stats (rel) min: 0.20% max: 34.86% x̄: 2.52% x̃: 1.09% HURT stats (abs) min: 1 max: 76198 x̄: 50383.00 x̃: 69677 HURT stats (rel) min: 0.07% max: 48.41% x̄: 15.65% x̃: 6.49% 95% mean confidence interval for cycles value: 8023.10 17863.21 95% mean confidence interval for cycles %-change: <.01% 4.60% Cycles are HURT. total spills in shared programs: 1086 -> 978 (-9.94%) spills in affected programs: 897 -> 789 (-12.04%) helped: 24 HURT: 0 total fills in shared programs: 1686 -> 1584 (-6.05%) fills in affected programs: 1371 -> 1269 (-7.44%) helped: 24 HURT: 0 v2: - Use brw_reg_type_is_integer (Matt Turner) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5341>	2020-06-16 16:54:27 -07:00
Matt Turner	66111bc95a	intel/compiler: Drop opt_sampler_eot() Gen9 and Cherryview have the ability to mark texture instructions with the End-of-thread bit under some conditions, which allows the texture result to be written to the render target directly, rather than returning to the EU. In order to handle overlapping primitives correctly, we have to use the 'sendc' instruction which stalls until other threads potentially writing to the same locations in the render target are retired. Unfortunately, this stall happens before the texture is sampled (rather than in parallel with stall), so for some literal edge cases (like the diagonal edge between two triangles forming a rectangle) there can be a performance penalty. As a result, it's probably not a good idea to use this optimization in general. I had planned to leave it enabled only for BLORP, where we use rectangle primitives and are typically clearing/blitting an entire render target without any overlapping primitives, but I noticed that the optimization wasn't applied in some normal cases anyway. For example, in the piglit test tests/shaders/glsl-fs-texture2d-bias.shader_test it is applied to one BLORP-blit shader but not another due to some kind of mishandling of register types (the destination register type of the texture operation is UD while the color source of the render target write is F). Additionally the instruction scheduler assumed that the combined texture and render target write operation took 0 cycles, leading to cycle estimates that are wildly inaccurate. Since the optimization was not implemented for SIMD32 and our decision whether to use the SIMD32 program is made by comparing the estimated performance with that of the SIMD16 shader, we wrongly threw out a bunch of SIMD32 programs that are likely profitable. total cycles in shared programs: 472807891 -> 473784245 (0.21%) cycles in affected programs: 108277 -> 1084631 (901.72%) helped: 0 HURT: 1290 total sends in shared programs: 998955 -> 1000245 (0.13%) sends in affected programs: 1400 -> 2690 (92.14%) helped: 0 HURT: 1290 LOST: 0 GAINED: 33 This patch shows no performance changes in Intel's Mesa performance CI. Given the problems, the lack of evidence that the pass improves performance, and the fact that the hardware feature was removed from subsequent GPU generations, I think that the pass is not valuable and should be removed. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5412>	2020-06-12 19:01:26 +00:00
Jason Ekstrand	92cfbb7d0c	intel/nir: Call nir_metadata_preserve on !progress Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5171>	2020-06-11 05:08:12 +00:00

1487 commits