fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-21 18:00:13 +01:00

Author	SHA1	Message	Date
Kenneth Graunke	78a195f252	intel/compiler: Postpone most int64 lowering to brw_postprocess_nir Float conversions continue to be lowered early at the same time as nir_lower_doubles, which we run early so we don't have to run it for every shader key variant. However, all other int64 lowering is now done late, after nir_opt_load_store_vectorize(), allowing it to comprehend basic arithmetic on 64-bit addresses. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23064>	2023-05-18 10:48:50 +00:00
Alyssa Rosenzweig	01e9ee79f7	nir: Drop unused name from nir_ssa_dest_init Since `624e799cc3` ("nir: Drop nir_ssa_def::name and nir_register::name"), SSA defs don't have names, making the name argument unused. Drop it from the signature and fix the call sites. This was done with the help of the following Coccinelle semantic patch: @@ expression A, B, C, D, E; @@ -nir_ssa_dest_init(A, B, C, D, E); +nir_ssa_dest_init(A, B, C, D); Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23078>	2023-05-17 23:46:16 +00:00
Alyssa Rosenzweig	c323762f9f	treewide: Stop lowering legacy atomics There are no more producers of legacy atomics so these calls are inert. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23036>	2023-05-16 22:36:21 +00:00
Lionel Landwerlin	952a523abb	intel: switch over to unified atomics Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23004>	2023-05-15 16:32:21 +00:00
Kenneth Graunke	f00143acc3	intel/compiler: Fold constants after distributing source modifiers This can generate things like fneg! of load_const, which is silly. Fold those away into an actual constant. Only do so on the scalar backend because there's a comment above that the vec4 backend doesn't want any new constants this late, and I'm inclined to believe it. fossil-db stats show a very minor improvement: Totals: Instrs: 203091223 -> 203091099 (-0.00%); split: -0.00%, +0.00% Cycles: 14410638075 -> 14410577067 (-0.00%); split: -0.00%, +0.00% Totals from 20 (0.00% of 665070) affected shaders: Instrs: 27067 -> 26943 (-0.46%); split: -0.47%, +0.01% Cycles: 2687958 -> 2626950 (-2.27%); split: -2.27%, +0.00% Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22881>	2023-05-09 00:16:40 -07:00
Ian Romanick	d47f521ee4	intel/compiler: Use NIR_PASS instead of NIR_PASS_V Reduce debug log spam by only logging the shader if a pass made some changes. This can also elide some nir_validate calls in debug builds. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22299>	2023-04-06 19:07:50 +00:00
Emma Anholt	f1ea6c1b40	intel: Always call nir_lower_frexp. We have NIR lowering for Vulkan, and rely on GLSL's lowering in the frontend, but this will let us drop the GLSL lowering. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22083>	2023-04-06 02:32:01 +00:00
Lionel Landwerlin	e25aee8e34	intel/fs: also allow vec8+ vectorization of load_global_const_block_intel Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21853>	2023-04-05 12:32:56 +00:00
Lionel Landwerlin	a358b97c58	intel/fs: optimize uniform SSBO & shared loads Using divergence analysis, figure out when SSBO & shared memory loads are uniform and carry the data only once in register space. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21853>	2023-04-05 12:32:56 +00:00
Patrick Lerda	5d85966805	intel: fix memory leak related to brw_nir_create_passthrough_tcs() Indeed, the parameter "mem_ctx" was not processed. For instance, this issue is triggered with the crocus driver and "piglit/bin/shader_runner tests/spec/arb_tessellation_shader/execution/compatibility/tes-clip-vertex-different-from-position.shader_test -auto -fbo": SUMMARY: AddressSanitizer: 235216 byte(s) leaked in 48 allocation(s). Fixes: `96ba0344db` ("intel: Use common helpers for TCS passthrough shaders") Signed-off-by: Patrick Lerda <patrick9876@free.fr> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22173>	2023-03-30 10:52:07 +00:00
Marcin Ślusarz	32107d8b5a	intel/compiler: compactify locations of mesh outputs Needed in support of anv code for Wa_14015590813. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17622>	2023-03-29 18:35:55 +00:00
Tapani Pälli	6538c5bcd4	intel/fs: restore message layout changes for cube array This reverts commit `bc04e2daca` that handled the change as a WA while this is about a new feature, change done in message layout. Patch also changes the original comment to not refer to Wa but bspec page. Fixes: `bc04e2daca` ("intel/fs: use generated helpers for Wa_1209978020 / Wa_18012201914") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Mark Janes <markjanes@swizzler.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22068>	2023-03-22 20:18:11 +00:00
Lionel Landwerlin	56474fae93	intel/fs: fix subgroup invocation read bounds checking nir->info.subgroup_size can be set to an enum : SUBGROUP_SIZE_VARYING = 0 SUBGROUP_SIZE_UNIFORM = 1 SUBGROUP_SIZE_API_CONSTANT = 2 SUBGROUP_SIZE_FULL_SUBGROUPS = 3 So compute the API subgroup size value and compare it to the dispatch size to determine whether we need some bound checking. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `9ac192d79d` ("intel/fs: bound subgroup invocation read to dispatch size") Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21856>	2023-03-14 12:15:48 +00:00
Lionel Landwerlin	bf59cfcee1	intel/fs: prevent large vector ops generated by peephole_ffma Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21782>	2023-03-14 10:38:50 +00:00
Mark Janes	bc04e2daca	intel/fs: use generated helpers for Wa_1209978020 / Wa_18012201914 Wa_1209978020 is a clone of Wa_18012201914. Update references to refer to the originating bug, and use generated helpers to ensure it is applied to future platforms as needed. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21741>	2023-03-07 01:41:53 +00:00
Caio Oliveira	07de034791	intel/compiler: Drop brw_nir_lower_scoped_barriers Now that we handle scoped barriers with execution scope during NIR -> Backend IR translation, this lowering is not needed anymore. Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21634>	2023-03-07 00:41:13 +00:00
Alyssa Rosenzweig	952bd63d6d	nir/opt_barrier: Generalize to control barriers For GLSL, we want to optimize code like memoryBarrierBuffer(); controlBarrier(); into a single scoped_barrier intrinsic for the backend to consume. Now that backends can get scoped_barriers everywhere, what's left is enabling backends to combine these barriers together. We already have an Intel-specific pass for combining memory barriers; it just needs a teensy bit of generalization to allow combining all sorts of barriers together. This avoids code quality regression on Asahi when switching to purely scoped barriers. It's probably useful for other backends too. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21661>	2023-03-06 22:09:27 +00:00
Faith Ekstrand	83fd7a5ed1	intel: Use nir_lower_tex_options::lower_index_to_offset Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21546>	2023-03-06 21:38:32 +00:00
Faith Ekstrand	9a4641cf6b	intel/nir: Limit unaligned loads to vec4 This probably doesn't affect Vulkan or GL because they can't have anything bigger than a vec4 anyway unless it's a u64vec4 and those have to be at least 8B aligned. This may affect CL apps if they use __attribute__((packed)) on something with big vectors, depending on how LLVM decides to translate that. Fixes: `f8aa83f0c8` ("intel/nir: Use nir_lower_mem_access_bit_sizes()") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21524>	2023-03-03 02:00:39 +00:00
Faith Ekstrand	eb9a56b6ca	nir: Rename nir_mem_access_size_align::align_mul to align It's a simple alignment so calling it align_mul is a bit misleading. Suggested-by: M Henning <drawoc@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21524>	2023-03-03 02:00:39 +00:00
Faith Ekstrand	ca4d73ba36	nir: Add a combined alignment helper Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: M Henning <drawoc@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21524>	2023-03-03 02:00:39 +00:00
Faith Ekstrand	116a851264	nir: Add mode filtering to lower_mem_access_bit_sizes Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: M Henning <drawoc@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21524>	2023-03-03 02:00:39 +00:00
Kenneth Graunke	96ba0344db	intel: Use common helpers for TCS passthrough shaders Rob added these new helpers a while back, which freedreno and radeonsi both share. We should use them too. The new helpers use variables and system value intrinsics, so we can drop the explicit binding table creation and just use the normal paths. Because we have to rewrite the system value uploading anyway, we drop the scrambling of the default tessellation levels on upload, and instead let the compiler go ahead and remap components like any normal shader. In theory, this results in more shuffling in the shader. In practice, we already do MOVs for message setup. In the passthrough shaders I looked at, this resulted in no extra instructions on Icelake (SIMD8 SINGLE_PATCH) and Tigerlake (8_PATCH). On Haswell, one shader grew by a single instruction for a pittance of cycles in a stage that isn't a performance bottleneck anyway. Avoiding remapping wasn't so much of an optimization as just the way that I originally wrote it. Not worth it. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20809>	2023-02-20 03:54:24 +00:00
Faith Ekstrand	f8aa83f0c8	intel/nir: Use nir_lower_mem_access_bit_sizes() This drops the Intel-specific pass in favor of the new generic one. No shader-db changes on Skylake or DG2. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21232>	2023-02-17 00:55:54 +00:00
Marcin Ślusarz	9d3e3c15f3	intel/compiler: replace gl_Layer & gl_ViewportIndex by 0 in fs if ms doesn't write it Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17620>	2023-02-14 08:24:51 +00:00
Alejandro Piñeiro	ba0bc7182d	anv: use shader_info->var_copies_lowered Instead of passing allow_copies as a parameter for brw_nir_optimize (so manually doing that tracking). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19338>	2023-02-06 22:11:34 +00:00
Jason Ekstrand	949b42c4dc	intel/compiler: Convert wm_prog_key::multisample_fbo to a tri-state This allows us to communicate to the back-end that we don't actually know if the framebuffer is multisampled or not. No drivers set anything but ALWAYS/NEVER and we still have a few ALWAYS/NEVER assumptions but those should be asserted. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>	2023-02-06 09:12:18 +00:00
Jason Ekstrand	5644011f06	intel/compiler: Convert wm_prog_key::persample_interp to a tri-state This allows for the possibility that we may not know at compile time if sample shading is enabled through the API. While we're here, also document exactly what this bit means so we don't confuse ourselves. v2: Fixup coarse pixel values (Lionel) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>	2023-02-06 09:12:18 +00:00
Jason Ekstrand	d25e5310bc	intel/nir: Lower barycentrics to per-sample in a dedicated pass This is more similar to what we do for single-sample and it should be more clear going forward once our lowering gets more complex. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>	2023-02-06 09:12:17 +00:00
Kenneth Graunke	90a2137cd5	intel/compiler: Use LSC opcode enum rather than legacy BRW_AOPs This gets our logical atomic messages using the lsc_opcode enum rather than the legacy BRW_AOP_* defines. We have to translate one way or another, and using the modern set makes sense going forward. One advantage is that the lsc_opcode encoding has opcodes for both integer and floating point atomics in the same enum, whereas the legacy encoding used overlapping values (BRW_AOP_AND == 1 == BRW_AOP_FMAX), which made it impossible to handle both sensibly in common code. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20604>	2023-01-19 08:42:22 +00:00
Lionel Landwerlin	94bb4a13fa	intel/fs: make Wa_1806565034 conditional to non robust access Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20280>	2022-12-13 18:05:19 +00:00
Kenneth Graunke	8c2448d4e6	intel/compiler: Delete sampler key handling for planar format stuff i965 used these, but Gallium drivers do this lowering via a separate nir_lower_tex call from st/mesa. Vulkan drivers don't use these at all. Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20223>	2022-12-09 10:18:25 +00:00
Jason Ekstrand	b4dd3df227	intel/nir: Set has_base_workgroup_id for lower_compute_system_values This option didn't exist half a decade ago when I first implemented base workgroup support in ANV. It's cleaner to just have split system values like all the other zero_base+base things do. We currently only do this for COMPUTE and not KERNEL because it lets us avoid changing intel_clc for now. We can add KERNEL later if needed. We also don't do this lowering for task/mesh. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20068>	2022-12-01 04:56:48 +00:00
Lionel Landwerlin	6f2dbe6da1	anv: enable lower_shader_calls vectorizing On Q2RTX RT shaders : Totals from 7 (22.58% of 31) affected shaders: Instrs: 15453 -> 14418 (-6.70%) Cycles: 232647 -> 224959 (-3.30%) Send messages: 574 -> 481 (-16.20%) Spill count: 118 -> 106 (-10.17%) Fill count: 156 -> 140 (-10.26%) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20058>	2022-11-30 07:23:30 +00:00
Caio Oliveira	fbe40720e0	intel/compiler: Remove redundant argument from brw_nir_create_passthrough_tcs Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19831>	2022-11-19 00:35:56 +00:00
Lionel Landwerlin	bdf680cd3f	intel/fs: use nir_opt_ray_query_ranges Results on DG2 q2rtx shaders: Totals from 6 (12.24% of 49) affected shaders: Instrs: 88927 -> 54088 (-39.18%) Cycles: 4115088 -> 2536902 (-38.35%) Send messages: 2639 -> 1609 (-39.03%) Spill count: 1321 -> 613 (-53.60%) Fill count: 3130 -> 1104 (-64.73%) Scratch Memory Size: 22528 -> 18432 (-18.18%) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16593>	2022-11-11 15:17:08 +00:00
Ian Romanick	351b8c6aec	intel/fs: Enable nir_op_imul_32x16 and nir_op_umul_32x16 on pre-Gfx7 Even though Intel's CI doesn't test these old platforms anymore, the validation added in "intel/eu/validate: Validate integer multiplication source size restrictions" combined with full shader-db runs gives me confidence in the changes. Sandy Bridge total instructions in shared programs: 13902341 -> 13902167 (<.01%) instructions in affected programs: 30771 -> 30597 (-0.57%) helped: 66 / HURT: 0 total cycles in shared programs: 741795500 -> 741791931 (<.01%) cycles in affected programs: 987602 -> 984033 (-0.36%) helped: 28 / HURT: 5 Iron Lake total instructions in shared programs: 8365806 -> 8365754 (<.01%) instructions in affected programs: 1766 -> 1714 (-2.94%) helped: 10 / HURT: 0 total cycles in shared programs: 248542694 -> 248542378 (<.01%) cycles in affected programs: 29836 -> 29520 (-1.06%) helped: 9 / HURT: 0 GM45 total instructions in shared programs: 5187127 -> 5187101 (<.01%) instructions in affected programs: 891 -> 865 (-2.92%) helped: 5 / HURT: 0 total cycles in shared programs: 163643914 -> 163643750 (<.01%) cycles in affected programs: 22206 -> 22042 (-0.74%) helped: 5 / HURT: 0 Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19602>	2022-11-09 21:34:26 +00:00
Ian Romanick	f90d71055b	intel/compiler: Add and use a pass to generate imul_32x16 instructions Gfx8 and Gfx9 platforms are helped for cycles because now many instructions like mul(8) g12<1>D g10<8,8,1>D 6D become mul(8) g12<1>D g10<8,8,1>D 6W It is the same number of instructions, but the 32x16 multiply is a little faster. v2: Fix transposed hi and lo in "(hi >= INT16_MIN && lo <= INT16_MAX)". Noticed by Caio. Use nir_src_is_const instead of open coding it. Suggested by Caio. Broadwell and Skylake had similar results. (Skylake shown) total cycles in shared programs: 845748380 -> 845145547 (-0.07%) cycles in affected programs: 446346348 -> 445743515 (-0.14%) helped: 6017 HURT: 0 helped stats (abs) min: 2 max: 7380 x̄: 100.19 x̃: 8 helped stats (rel) min: <.01% max: 3.72% x̄: 0.41% x̃: 0.39% 95% mean confidence interval for cycles value: -113.37 -87.00 95% mean confidence interval for cycles %-change: -0.42% -0.41% Cycles are helped. Skylake Cycles in all programs: 8844820715 -> 8828897462 (-0.2%) Cycles helped: 47914 Cycles hurt: 1 No shader-db or fossil-db changes on any other Intel platform. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17718>	2022-11-08 00:02:16 +00:00
Francisco Jerez	5d4df3ac23	intel/compiler: Run extra fp64 lowering pass on devices that don't support int64. In some cases nir_lower_int64 will emit fp64 operations which aren't natively supported on any Intel hardware (e.g. ftrunc, frem). An extra pass of nir_opt_algebraic (for frem) and nir_lower_doubles is required in order to take care of them. This fixes several int64 test-cases on MTL hardware. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Mykhailo Skorokhodov <mykhailo.skorokhodov@globallogic.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19390>	2022-11-07 07:35:22 +00:00
Kenneth Graunke	88756cee8d	intel/compiler: Run nir_opt_large_constants before scalarizing consts nir_opt_large_constants balks at seeing a store_deref of a variable where the source is a vecN operation of multiple load_consts, and thinks that isn't a constant, so it should not bother promoting it. Unfortunately, we were running nir_lower_load_const_to_scalar before nir_opt_large_constants, so this prevented a ton of constant promotion. This commit /used to help/ some shaders in shader-db. Presumably since !16770 landed, those shaders were already helped. Currently there are no shader-db changes on any Intel platform. Fossil-db results: All Intel platforms had similar results. (Ice Lake shown) Instructions in all programs: 141998227 -> 141421756 (-0.4%) Instructions helped: 12515 Instructions hurt: 237 SENDs in all programs: 7437925 -> 7468033 (+0.4%) SENDs hurt: 12806 Cycles in all programs: 9161655753 -> 9132869800 (-0.3%) Cycles helped: 10163 Cycles hurt: 2637 Spills in all programs: 19977 -> 18678 (-6.5%) Spills helped: 384 Spills hurt: 40 Fills in all programs: 32863 -> 31396 (-4.5%) Fills helped: 385 Fills hurt: 42 Lost: 1 Lots of Shadow of the Tomb Raider fragment shaders and Batman Arkham Origins vertex shaders were hurt for SENDs in this commit. A couple Aztec Ruins compute shaders and Spaceship shaders (multiple stages) were also hurt. All of the shaders hurt for spills or fills were Spaceship compute shaders. Nearly all of the shaders helped were Shadow of the Tomb Raider fragment shaders. One Spaceship shader was really, REALLY helped: Spills helped fossils/fossil-db/Spaceship.run.9f90a2a226fcc57f.1.foz/0b507d3abe2e3c28/compute: 321 -> 13 (-96.0%) Fills helped fossils/fossil-db/Spaceship.run.9f90a2a226fcc57f.1.foz/0b507d3abe2e3c28/compute: 279 -> 21 (-92.5%) Overall this seems like an improvement, but we may want to actually run these few benchmarks before landing. Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16539>	2022-11-01 14:55:21 -07:00
Alyssa Rosenzweig	941c37c085	nir/lower_idiv: Remove imprecise_32bit_lowering NIR has two implementations of lower_idiv, keyed on the imprecise_32bit_lowering flag. This flag is misleading: the results when setting this flag aren't "imprecise", they're completely wrong for some values. If a backend has a native implementation of umul_high, the correct path isn't that much more expensive. If it doesn't, it's substantially slower for highp integer division... but in practice, non-constant highp integer division is pretty rare. After a painful migration of the tree, this code path has no more users. Remove it so nobody else gets the bright idea of using it again. Closes: #6555 Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19303>	2022-10-27 19:37:14 +00:00
Tapani Pälli	1e51383258	intel/compiler: run nir_opt_idiv_const before nir_lower_idiv Integer div lowering can potentially create a lot of code that is not removed later on. Running const lowering pass first can be used to eliminate that code. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19157>	2022-10-20 15:35:48 +03:00
Kenneth Graunke	2dfab687ec	intel/compiler: Vectorize gl_TessLevelInner/Outer[] writes [v2] Setting the NIR options takes care of iris thanks to the common st/mesa linking code, and updating brw_nir_link_shaders should handle anv. The main effort here is updating remap_tess_levels, which needs to handle vector stores, writemasking, and swizzling. Unfortunately, we also need to continue handling the existing single-component access because it's used for TES inputs, which we don't vectorize. We could try to vectorize TES inputs too, but they're all pushed anyway, so it wouldn't buy us much other than deleting this code. Also, we do have opt_combine_stores, but not one for loads. One limitation of using nir_vectorize_tess_levels is that it works on variables, and so isn't able to combine outer/inner writes that happen to live in the same vec4 slot (for triangle domains). That said, it's still better than before. For writes, we allow the intrinsics to supply up to the full size of the variable (vec4 for outer, vec2 for inner) even if the domain only requires a subset of those components (i.e. triangles needs 3). shader-db results on Icelake: total instructions in shared programs: 19600314 -> 19597528 (-0.01%) instructions in affected programs: 65338 -> 62552 (-4.26%) helped: 271 / HURT: 0 helped stats (abs) min: 6 max: 24 x̄: 10.28 x̃: 12 helped stats (rel) min: 1.30% max: 18.18% x̄: 5.80% x̃: 7.59% 95% mean confidence interval for instructions value: -10.71 -9.85 95% mean confidence interval for instructions %-change: -6.17% -5.43% Instructions are helped. total cycles in shared programs: 851842332 -> 851808165 (<.01%) cycles in affected programs: 618577 -> 584410 (-5.52%) helped: 271 / HURT: 0 helped stats (abs) min: 64 max: 540 x̄: 126.08 x̃: 111 helped stats (rel) min: 2.57% max: 37.97% x̄: 6.12% x̃: 5.06% 95% mean confidence interval for cycles value: -135.35 -116.80 95% mean confidence interval for cycles %-change: -6.67% -5.57% Cycles are helped. total sends in shared programs: 1025238 -> 1024308 (-0.09%) sends in affected programs: 6454 -> 5524 (-14.41%) helped: 271 / HURT: 0 helped stats (abs) min: 2 max: 8 x̄: 3.43 x̃: 4 helped stats (rel) min: 5.71% max: 25.00% x̄: 14.98% x̃: 17.39% 95% mean confidence interval for sends value: -3.57 -3.29 95% mean confidence interval for sends %-change: -15.42% -14.54% Sends are helped. According to Felix DeGrood, this results in a 10% improvement in the draw call time for certain draw calls from Strange Brigade. v2: Fix assertions about number of components and add more of them. Combine the quads and triangles handling as it's nearly identical. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> [v1] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19061>	2022-10-13 11:38:21 -07:00
Kenneth Graunke	b61b1d5a4c	Revert "intel/compiler: Vectorize gl_TessLevelInner/Outer[] writes" This reverts commit `abba55382f`. The assertions I added late in the process broke shader-db, and my quick fix broke CI, so let's just revert it for now and I'll resubmit this later when it's working better. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7385 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18895>	2022-09-29 17:39:18 -07:00
Lionel Landwerlin	23c7142cd6	anv: disable SIMD16 for RT shaders Since divergence is a lot more likely in RT than compute, it makes sense to limit ourselves to SIMD8. The trampoline shader defaults to SIMD16 since this one is uniform. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16970>	2022-09-28 05:38:37 +00:00
Lionel Landwerlin	8fc7a98e31	intel/fs: disable split_array_vars on opencl kernels Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16970>	2022-09-28 05:38:36 +00:00
Kenneth Graunke	abba55382f	intel/compiler: Vectorize gl_TessLevelInner/Outer[] writes Setting the NIR options takes care of iris thanks to the common st/mesa linking code, and updating brw_nir_link_shaders should handle anv. The main effort here is updating remap_tess_levels, which needs to handle vector stores, writemasking, and swizzling. Unfortunately, we also need to continue handling the existing single-component access because it's used for TES inputs, which we don't vectorize. We could try to vectorize TES inputs too, but they're all pushed anyway, so it wouldn't buy us much other than deleting this code. Also, we do have opt_combine_stores, but not one for loads. One limitation of using nir_vectorize_tess_levels is that it works on variables, and so isn't able to combine outer/inner writes that happen to live in the same vec4 slot (for triangle domains). That said, it's still better than before. For writes, we allow the intrinsics to supply up to the full size of the variable (vec4 for outer, vec2 for inner) even if the domain only requires a subset of those components (i.e. triangles needs 3). shader-db results on Icelake: total instructions in shared programs: 19605070 -> 19602284 (-0.01%) instructions in affected programs: 65338 -> 62552 (-4.26%) helped: 271 / HURT: 0 helped stats (abs) min: 6 max: 24 x̄: 10.28 x̃: 12 helped stats (rel) min: 1.30% max: 18.18% x̄: 5.80% x̃: 7.59% 95% mean confidence interval for instructions value: -10.71 -9.85 95% mean confidence interval for instructions %-change: -6.17% -5.43% Instructions are helped. total cycles in shared programs: 851854659 -> 851820320 (<.01%) cycles in affected programs: 618749 -> 584410 (-5.55%) helped: 271 / HURT: 0 helped stats (abs) min: 69 max: 540 x̄: 126.71 x̃: 108 helped stats (rel) min: 2.57% max: 37.97% x̄: 6.17% x̃: 5.06% 95% mean confidence interval for cycles value: -135.89 -117.54 95% mean confidence interval for cycles %-change: -6.72% -5.63% Cycles are helped. total sends in shared programs: 1025285 -> 1024355 (-0.09%) sends in affected programs: 6454 -> 5524 (-14.41%) helped: 271 / HURT: 0 helped stats (abs) min: 2 max: 8 x̄: 3.43 x̃: 4 helped stats (rel) min: 5.71% max: 25.00% x̄: 14.98% x̃: 17.39% 95% mean confidence interval for sends value: -3.57 -3.29 95% mean confidence interval for sends %-change: -15.42% -14.54% Sends are helped. According to Felix DeGrood, this results in a 10% improvement in the draw call time for certain draw calls from Strange Brigade. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17944>	2022-09-27 18:17:56 -07:00
Marcin Ślusarz	3c96959bbc	intel/compiler: print shader after successful brw_nir_lower_shading_rate_output Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18702>	2022-09-20 17:23:45 +00:00
Pierre-Eric Pelloux-Prayer	70891edd97	nir: add a nir_opt_if_options enum And don't enable nir_opt_if_optimize_phi_true_false on radeonsi with LLVM 14 because it crashes Blender. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6976 Cc: mesa-stable Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17949>	2022-08-10 12:55:39 +00:00
Jason Ekstrand	87ab287436	vulkan: Call lower_clip_cull_distance_arrays in vk_spirv_to_nir Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17644>	2022-07-21 21:18:48 +00:00

331 commits