fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-22 13:30:12 +01:00

Author	SHA1	Message	Date
Caio Oliveira	25384dccc0	intel/brw: Remove 'fs' prefix from passes filenames Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32813>	2025-01-02 18:11:05 +00:00
Ian Romanick	f0bf68dd25	brw/const: Remove TODO that isn't allowed by the hardware There are a lot of restrictions for bfloat16. The one that prevents this very useful optimization from being possible is, "Broadcast of bfloat16 scalar is not supported." Part of the reason this MR exists is to build up to implementing BF support, and there are a couple more commits that implement this. However, it fails on both real hardware and simulation: Instruction is: mad (8|M0) r6.0<1>:f 0xBF80:bf r2.0<8;1>:f r64.0<0>:f In bfloat/float mixed mode, bfloat src must be packed. Alas. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32436>	2024-12-13 01:24:26 +00:00
Ian Romanick	99d3755bdd	brw/const: Allow HF constants in MAD on Gfx11 These can't mix with F values, but if the non-constant sources are already HF, this is allowed in src0. No shader-db changes on any Intel platform. fossil-db: Ice Lake Totals: Instrs: 236027458 -> 236027442 (-0.00%) Cycle count: 24515944704 -> 24515945379 (+0.00%) Totals from 8 (0.00% of 798454) affected shaders: Instrs: 10226 -> 10210 (-0.16%) Cycle count: 58567 -> 59242 (+1.15%) Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32436>	2024-12-13 01:24:26 +00:00
Ian Romanick	4c462b6b32	brw/const: Allow constants in integer MAD Nothing can generate this currently, but a future commit will. The Bspec and experimentation support the following limitations: - Gfx11: Either src0 or src2 can be W or UW. - Gfx12: Either src0 or src2 can be W or UW. - Gfx12.5: Both src0 and src2 can be W or UW. - Gfx20: Both src0 and src2 can be W or UW. v2: Add missing break statement. v3: Leave the MAD handling in the case with the other 3 source instructions. Suggested by Caio. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32436>	2024-12-13 01:24:26 +00:00
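The per-platform limits listed in the commit above can be read as a small lookup table. The following is only an illustrative Python sketch, not the code in the pass: the table name `MAX_WORD_IMMEDIATES` and the function `integer_mad_immediates_ok` are hypothetical, and the model assumes "either" means at most one W/UW immediate among src0 and src2, while "both" means up to two.

```python
# Maximum number of W/UW immediates allowed among src0 and src2 of an
# integer MAD, transcribed from the limits in the commit message.
# The dictionary and function names are hypothetical, for illustration only.
MAX_WORD_IMMEDIATES = {
    "Gfx11": 1,    # either src0 or src2
    "Gfx12": 1,    # either src0 or src2
    "Gfx12.5": 2,  # both src0 and src2
    "Gfx20": 2,    # both src0 and src2
}


def integer_mad_immediates_ok(platform, src0_is_imm, src2_is_imm):
    """Return True if this combination of W/UW immediates fits the limit."""
    used = int(src0_is_imm) + int(src2_is_imm)
    return used <= MAX_WORD_IMMEDIATES[platform]
```

Under this reading, a MAD with immediates in both src0 and src2 is rejected on Gfx11 and Gfx12 but accepted on Gfx12.5 and Gfx20.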
Ian Romanick	9fa6b68f9e	brw/const: Refactor checking whether an immediate source is allowed Should be no functional change here. This simplifies some later changes. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32436>	2024-12-13 01:24:26 +00:00
Ian Romanick	d9b019b683	brw/copy: Don't try to be clever about ADD3 constant propagation Always propagate into any source. Let commute_immediates and constant combining sort out the mess. It's literally their job. No shader-db changes on any Intel platform. The fossil-db changes just appear to be subtle changes in register allocation if the immediate source changes from src0 to src2. v2: Update the comment in commute_immediates. Suggested by Caio. fossil-db: Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown) Totals: Cycle count: 31610720510 -> 31610720660 (+0.00%); split: -0.00%, +0.00% Totals from 8 (0.00% of 702433) affected shaders: Cycle count: 5522382 -> 5522532 (+0.00%); split: -0.00%, +0.00% Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32436>	2024-12-13 01:24:26 +00:00
Ian Romanick	a84e3a0f55	brw/const: Allow mixing signed and unsigned immediate sources No shader-db or fossil-db changes on any Intel platform. This commit just prevents issues with a later commit, "brw/copy: Don't try to be clever about ADD3 constant propagation." v2: Use 'can_promote = true; break;' instead of 'return true;'. Suggested by Caio. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32436>	2024-12-13 01:24:26 +00:00
Kenneth Graunke	7bed11fbde	intel/brw: Allow immediates in the BFE instruction on Gfx12+ We weren't allowing immediates in BFE at all. Gfx12+ supports immediates in src0 (value) and src2 (width), but not src1 (offset). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31437>	2024-10-24 21:31:28 +00:00
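The BFE rule stated in the commit above (immediates allowed in src0 and src2 but not src1, on Gfx12+) can be written as a one-line predicate. This is a hedged Python sketch only: the function name and the use of a numeric `gfx_ver` are assumptions made for illustration and do not reflect how the compiler actually encodes the check.

```python
def bfe_source_allows_immediate(gfx_ver, src_index):
    """Model of the BFE rule from the commit message.

    gfx_ver is a numeric hardware generation (assumed representation);
    src_index is 0 (value), 1 (offset) or 2 (width).
    """
    # Gfx12+ accepts immediates for the value and width, never the offset.
    return gfx_ver >= 12 and src_index in (0, 2)
```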
Kenneth Graunke	4cb67cb07a	intel/brw: Use whole 512-bit registers in constant combining on Xe2 Xe2 increased the register size from 256-bits to 512-bits. So we can store 32 16-bit values in a register, rather than 16 values. Prior to this patch, we hadn't updated the pass, so the second half of each of our registers was unused. Backport-to: 24.2 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31499>	2024-10-15 18:14:37 +00:00
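The capacity figures in the commit above follow from dividing the register width by the 16-bit value size. A minimal Python sketch of that arithmetic, with a hypothetical helper name:

```python
def values_per_register(register_bits, value_bits=16):
    """Number of fixed-size values that fit in one register."""
    return register_bits // value_bits


# 256-bit registers (before Xe2) hold 16 values; 512-bit registers hold 32.
pre_xe2 = values_per_register(256)
xe2 = values_per_register(512)
```

With the pass still assuming 256 bits on Xe2, only `pre_xe2` of the `xe2` available slots would be used, which matches the "second half unused" description.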
Kenneth Graunke	d9e5022650	intel/brw: Delete more Gfx8 code from brw_fs_combine_constants These platforms are supported by elk, not brw. Backport-to: 24.2 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31499>	2024-10-15 18:14:37 +00:00
Caio Oliveira	8a39231e4f	intel/brw: Move calculate_cfg out of fs_visitor Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30169>	2024-07-25 15:37:13 +00:00
Caio Oliveira	8ba8e33c39	intel/brw: Simplify @file annotations Doxygen documentation says > If the file name is omitted (i.e. the line after \file is left > blank) then the documentation block that contains the \file command will > belong to the file it is located in. so we can omit the filename itself when using the annotation. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30168>	2024-07-22 22:48:03 +00:00
Caio Oliveira	3670c24740	intel/brw: Replace uses of fs_reg with brw_reg And remove the fs_reg alias. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29791>	2024-07-03 02:53:19 +00:00
Caio Oliveira	d00329e821	intel/brw: Replace some fs_reg constructors with functions Create three helper functions for ATTR, UNIFORM and VGRF creation. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29791>	2024-07-03 02:53:18 +00:00
Ian Romanick	033405cd4b	intel/brw: Combine constants and constant propagation for CSEL No shader-db or fossil-db changes on any Intel platform. This ends up being helpful in "intel/brw: Use range analysis to optimize fsign." v2: Add integer CSEL support v3: Massive simplification (-20 lines!) of constant propagation logic. Suggested by Ken. Add missing CSEL case in supports_src_as_imm. Noticed by Ken. v4: While MAD can mix F and HF sources on some platforms, CSEL cannot. Found by skqp on TGL. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v3] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095>	2024-05-14 01:28:20 +00:00
Kenneth Graunke	545bb8fb6f	intel/brw: Replace type_sz and brw_reg_type_to_size with brw_type_size_* Both of these helpers do the same thing. We now have brw_type_size_bits and brw_type_size_bytes and can use whichever makes sense in that place. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>	2024-04-25 11:41:48 +00:00
Kenneth Graunke	007d891239	intel/brw: Use newer brw_type_is_* shorter names Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>	2024-04-25 11:41:48 +00:00
Kenneth Graunke	873fcdff38	intel/brw: Stop using long BRW_REGISTER_TYPE enum names s/BRW_REGISTER_TYPE/BRW_TYPE/g Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>	2024-04-25 11:41:48 +00:00
Kenneth Graunke	c45e235df5	intel/brw: Drop NF type support Icelake removed the PLN instruction for interpolating fragment shader inputs, instead adding a special "Native Float" (NF) data type which was a 66-bit floating point data type that could only be used with the accumulator. On Tigerlake, they dropped NF support in favor of just doing the interpolation with MAD instructions. We stopped using NF years ago (commit `9ea90aae1e`), instead just using the fs_visitor::lower_linterp() pass to emit MADs. Since this existed only for a short time, and had very limited utility, we drop it from the compiler. One downside is that we can no longer disassemble Icelake shaders containing NF types properly, but I doubt anyone really minds. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>	2024-04-25 11:41:48 +00:00
Ian Romanick	b835784dde	intel/brw: Remove last vestiges of could_coissue Most of the obvious bits were removed by `7ac5696157` ("intel/brw: Remove Gfx8- code from backend passes"). No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28342>	2024-03-23 01:29:22 +00:00
Yonggang Luo	1ac1c0843f	treewide: Replace usage of macro DEBUG with MESA_DEBUG when possible This is achieved by the following steps: #ifndef DEBUG => #if !MESA_DEBUG; defined(DEBUG) => MESA_DEBUG; #ifdef DEBUG => #if MESA_DEBUG. This is done by replace in vscode, excluding docs,*.rs,addrlib,src/imgui,*.sh,src/intel/vulkan/grl/gpu. These are safe because the files that should keep the DEBUG macro are already excluded, and DEBUG is not replaced directly, as we have some symbols around it. Use debug or NDEBUG instead of DEBUG in comments when proper. This reduces the usage of DEBUG, so it's easier to migrate to MESA_DEBUG. These were found when migrating DEBUG to MESA_DEBUG; they are all comment updates, so it's safe. Replace comment /* DEBUG */ and /* !DEBUG */ with proper /* MESA_DEBUG */ or /* !MESA_DEBUG */ manually. DEBUG || !NDEBUG -> MESA_DEBUG || !NDEBUG; !DEBUG && NDEBUG -> !(MESA_DEBUG || !NDEBUG). Replace the DEBUG present in comment with proper new MESA_DEBUG manually. Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Acked-by: David Heidelberg <david.heidelberg@collabora.com> Reviewed-by: Eric Engestrom <eric@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28092>	2024-03-22 18:22:34 +00:00
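The three preprocessor rewrites named in the commit above can be expressed as ordered regular-expression substitutions. The commit says the change was made with editor search-and-replace, so the Python sketch below is not the author's tooling: `RULES` and `migrate` are hypothetical names, and the patterns are one possible way to apply the same three mappings without touching identifiers such as `NDEBUG` or `DEBUG_FOO`.

```python
import re

# Ordered rewrite rules mirroring the three steps in the commit message.
# The word boundary after DEBUG keeps identifiers like DEBUG_FOO untouched.
RULES = [
    (re.compile(r"#\s*ifndef\s+DEBUG\b"), "#if !MESA_DEBUG"),
    (re.compile(r"#\s*ifdef\s+DEBUG\b"), "#if MESA_DEBUG"),
    (re.compile(r"\bdefined\s*\(\s*DEBUG\s*\)"), "MESA_DEBUG"),
]


def migrate(line):
    """Apply every rule to a single source line and return the result."""
    for pattern, replacement in RULES:
        line = pattern.sub(replacement, line)
    return line
```

For example, `migrate("#if defined(DEBUG) || !defined(NDEBUG)")` leaves the `NDEBUG` test alone and rewrites only the `DEBUG` test.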
Ian Romanick	d9674cbe7d	intel/brw: Combine constants for src0 of POW instructions too I tried this when I was working on MR !7698, and it didn't have much effect back then. Maybe I've added more stuff to my fossil-db? Gfx12 platforms (Tiger Lake and DG2) are unaffected because the POW instruction was removed. shader-db: Ice Lake and Skylake had similar results. (Ice Lake shown) total instructions in shared programs: 20301933 -> 20301900 (<.01%) instructions in affected programs: 9077 -> 9044 (-0.36%) helped: 33 / HURT: 0 total cycles in shared programs: 842797624 -> 842799471 (<.01%) cycles in affected programs: 1361911 -> 1363758 (0.14%) helped: 35 / HURT: 111 LOST: 0 GAINED: 9 fossil-db: Ice Lake and Skylake had similar results. (Ice Lake shown) Totals: Instrs: 165510222 -> 165510163 (-0.00%) Cycles: 15125195835 -> 15125194484 (-0.00%); split: -0.00%, +0.00% Spill count: 45204 -> 45196 (-0.02%) Fill count: 74157 -> 74149 (-0.01%) Totals from 65 (0.01% of 656118) affected shaders: Instrs: 57426 -> 57367 (-0.10%) Cycles: 1667918 -> 1666567 (-0.08%); split: -0.11%, +0.03% Spill count: 137 -> 129 (-5.84%) Fill count: 515 -> 507 (-1.55%) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27552>	2024-03-12 21:31:30 +00:00
Ian Romanick	e7480f94c1	intel/brw: Combine constants for src0 of integer multiply too The majority of cases that would have been affected by this actually had both sources as integer constants. The earlier commit "intel/rt: Don't directly generate umul_32x16" allowed those to be constant folded. v2: Move the a-1 block to be near the existing a-1 block. No shader-db changes on any Intel platform. fossil-db results: All Intel platforms had similar results. (Ice Lake shown) Totals: Instrs: 165510246 -> 165510222 (-0.00%) Cycles: 15125198238 -> 15125195835 (-0.00%); split: -0.00%, +0.00% Totals from 46 (0.01% of 656118) affected shaders: Instrs: 36010 -> 35986 (-0.07%) Cycles: 2613658 -> 2611255 (-0.09%); split: -0.17%, +0.07% Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27552>	2024-03-12 21:31:30 +00:00
Caio Oliveira	7ac5696157	intel/brw: Remove Gfx8- code from backend passes Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>	2024-02-28 05:45:38 +00:00
Caio Oliveira	4f09ad9dee	intel/brw: Pull opt_combine_constants out of fs_visitor Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26887>	2024-02-26 20:54:24 +00:00
Caio Oliveira	4dbf9181cd	intel/compiler: Fix rebuilding the CFG in fs_combine_constants When building the CFG the instructions are taken off the list in fs_visitor and added to the lists inside each block. The single "exec_node" in the instruction is used for those memberships. In the case the pass rebuilt the CFG, it had no instructions, so calculate_cfg() had nothing to work with. For now fix the bug by pulling all the instructions back to the original list. We can do better here, but punting until upcoming work on CFG itself. Issue found in an unpublished CTS test. Small reproduction in our unit tests now enabled. Fixes: `65237f8bbc` ("intel/fs: Don't add MOV instructions to DO blocks in combine constants") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27131>	2024-01-19 01:59:36 +00:00
Caio Oliveira	cf730adc58	intel/compiler: Make fs_builder include fs_visitor and not the other way This will allow fs_builder to have a reference to an fs_visitor (a "fs_shader" really), instead of a reference to a backend_shader. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26323>	2023-12-12 19:36:14 +00:00
Caio Oliveira	5b8ec015f2	intel/compiler: Don't use fs_visitor::bld in remaining places The remaining users can simply create a new builder at_end() if needed. In many places a new builder object is already being constructed, so just give more specific instructions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26323>	2023-12-12 19:36:14 +00:00
Ian Romanick	65237f8bbc	intel/fs: Don't add MOV instructions to DO blocks in combine constants There was a subtle bug related to CFG tracking. Namely, some branch instructions may point only to the block after the DO instruction for the loop. If the MOV instructions are in the DO block, they may not have liveness properly tracked. Like in !25132, having the MOV instructions in blocks that might contain other instructions helps scheduling. shader-db: All Broadwell and newer Intel GPUs had similar results (Ice Lake shown) total cycles in shared programs: 848577248 -> 848557268 (<.01%) cycles in affected programs: 78256396 -> 78236416 (-0.03%) helped: 361 / HURT: 18 fossil-db: All Skylake and newer Intel GPUs had similar results (Ice Lake shown) Totals: Cycles: 15021501924 -> 15021372904 (-0.00%); split: -0.00%, +0.00% Totals from 735 (0.11% of 656080) affected shaders: Cycles: 676429502 -> 676300482 (-0.02%); split: -0.02%, +0.00% Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26439>	2023-12-08 20:21:28 +00:00
Ian Romanick	927a24db14	intel/fs: New VGRF packing scheme for constant combining Each block is processed separately. VGRF channels that are allocated to values that are only used in a particular block are made available in other blocks. This is almost always an improvement, but there are some pessimal cases where it goes horribly wrong. Imagine a shader with two blocks. In that shader, the first block has 5 constants used in the first block and the second block. Three other constants are only used in the first block. The second block has 15 constants that are used only in the block. The static VGRF usage is 3 regardless of packing. However, scheduling may be able to shorten the live range of the first VGRF when it only has values that came from the first block (because three of the values are dead on entry to the second block). This used to occur in a Mad Max shader on Broadwell. That shader went from 0:0 spills:fills to 107:52. Some changes over the last year, I'm assuming !13734, have prevented this case from occurring. This change created a lot of churn on Haswell and Ivy Bridge. This seems to be primarily due to all the extra constants used for coissue, but I did not investigate very deeply. On older platforms, there were no changes to spills or fills. As a result, this is only used on Broadwell and newer platforms. v2: Update expected checksum for pixmark-piano-v2.trace on gl-zink-anv-tgl. See #9714 for more details. 
shader-db results: Tiger Lake total instructions in shared programs: 21101332 -> 21102084 (<.01%) instructions in affected programs: 863686 -> 864438 (0.09%) helped: 463 / HURT: 437 total cycles in shared programs: 790573225 -> 790664391 (0.01%) cycles in affected programs: 92546803 -> 92637969 (0.10%) helped: 558 / HURT: 629 total spills in shared programs: 3959 -> 3951 (-0.20%) spills in affected programs: 184 -> 176 (-4.35%) helped: 2 / HURT: 0 total fills in shared programs: 2639 -> 2631 (-0.30%) fills in affected programs: 184 -> 176 (-4.35%) helped: 2 / HURT: 0 LOST: 1 GAINED: 5 Ice Lake and Skylake had similar results. (Ice Lake shown) total instructions in shared programs: 19945216 -> 19944711 (<.01%) instructions in affected programs: 139569 -> 139064 (-0.36%) helped: 66 / HURT: 3 total cycles in shared programs: 858410082 -> 857381323 (-0.12%) cycles in affected programs: 383825958 -> 382797199 (-0.27%) helped: 1012 / HURT: 1055 total spills in shared programs: 6190 -> 6116 (-1.20%) spills in affected programs: 891 -> 817 (-8.31%) helped: 66 / HURT: 3 total fills in shared programs: 7382 -> 7238 (-1.95%) fills in affected programs: 1538 -> 1394 (-9.36%) helped: 66 / HURT: 3 LOST: 5 GAINED: 8 Broadwell total instructions in shared programs: 17820886 -> 17812515 (-0.05%) instructions in affected programs: 800512 -> 792141 (-1.05%) helped: 385 / HURT: 1 total cycles in shared programs: 904482935 -> 903102070 (-0.15%) cycles in affected programs: 422427015 -> 421046150 (-0.33%) helped: 1091 / HURT: 812 total spills in shared programs: 17908 -> 16576 (-7.44%) spills in affected programs: 9459 -> 8127 (-14.08%) helped: 386 / HURT: 0 total fills in shared programs: 25397 -> 22354 (-11.98%) fills in affected programs: 15504 -> 12461 (-19.63%) helped: 385 / HURT: 1 LOST: 2 GAINED: 2 No shader-db changes on Haswell or older platforms. 
fossil-db results: Tiger Lake Instructions in all programs: 156881463 -> 156890970 (+0.0%) Instructions helped: 9033 Instructions hurt: 10285 Cycles in all programs: 7532597466 -> 7529647924 (-0.0%) Cycles helped: 10548 Cycles hurt: 13667 Spills in all programs: 5490 -> 5110 (-6.9%) Spills helped: 100 Spills hurt: 3 Fills in all programs: 6123 -> 5752 (-6.1%) Fills helped: 100 Fills hurt: 3 Gained: 17 Lost: 47 Ice Lake Instructions in all programs: 141309644 -> 141309603 (-0.0%) Instructions helped: 9 Instructions hurt: 4 Cycles in all programs: 9095812690 -> 9097008049 (+0.0%) Cycles helped: 14288 Cycles hurt: 16381 Spills in all programs: 7418 -> 7404 (-0.2%) Spills helped: 9 Spills hurt: 4 Fills in all programs: 8326 -> 8321 (-0.1%) Fills helped: 9 Fills hurt: 4 Skylake Instructions in all programs: 131872347 -> 131870690 (-0.0%) Instructions helped: 111 Instructions hurt: 3 Cycles in all programs: 8800835649 -> 8802483884 (+0.0%) Cycles helped: 9415 Cycles hurt: 9678 Spills in all programs: 6917 -> 6476 (-6.4%) Spills helped: 111 Spills hurt: 3 Fills in all programs: 7584 -> 7354 (-3.0%) Fills helped: 111 Fills hurt: 3 Lost: 5 Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7698>	2023-08-29 19:01:37 +00:00
Ian Romanick	c506d7e511	intel/fs: Combine constants for integer instructions too v2: Remove type change for SHR with negation. This was a leftover from a previous attempt to deal with SHR and negation. Now all right-shifts with unsigned parameters are marked as not being able to have source modifiers. v3: Disallow negations on right shifts of unsigned sources by setting the no_negations flag in add_candidate_immediate. This eliminates the need to exclude SHR in can_do_source_mods. Tiger Lake total instructions in shared programs: 21102817 -> 21099443 (-0.02%) instructions in affected programs: 296796 -> 293422 (-1.14%) helped: 92 / HURT: 356 total cycles in shared programs: 790564691 -> 790393358 (-0.02%) cycles in affected programs: 36456886 -> 36285553 (-0.47%) helped: 171 / HURT: 286 total spills in shared programs: 3951 -> 3959 (0.20%) spills in affected programs: 176 -> 184 (4.55%) helped: 0 / HURT: 2 total fills in shared programs: 2631 -> 2639 (0.30%) fills in affected programs: 176 -> 184 (4.55%) helped: 0 / HURT: 2 LOST: 0 GAINED: 4 Ice Lake total instructions in shared programs: 19954204 -> 19949122 (-0.03%) instructions in affected programs: 40301 -> 35219 (-12.61%) helped: 23 / HURT: 2 total cycles in shared programs: 858377735 -> 858462082 (<.01%) cycles in affected programs: 75537286 -> 75621633 (0.11%) helped: 124 / HURT: 319 total spills in shared programs: 6255 -> 6190 (-1.04%) spills in affected programs: 392 -> 327 (-16.58%) helped: 1 / HURT: 2 total fills in shared programs: 7813 -> 7382 (-5.52%) fills in affected programs: 942 -> 511 (-45.75%) helped: 1 / HURT: 2 LOST: 0 GAINED: 3 Skylake total instructions in shared programs: 18049362 -> 18044440 (-0.03%) instructions in affected programs: 48317 -> 43395 (-10.19%) helped: 26 / HURT: 2 total cycles in shared programs: 844884806 -> 844915655 (<.01%) cycles in affected programs: 76137133 -> 76167982 (0.04%) helped: 171 / HURT: 293 total spills in shared programs: 6148 -> 6149 (0.02%) spills in 
affected programs: 595 -> 596 (0.17%) helped: 4 / HURT: 2 total fills in shared programs: 7484 -> 7067 (-5.57%) fills in affected programs: 1226 -> 809 (-34.01%) helped: 4 / HURT: 2 LOST: 0 GAINED: 8 Broadwell total instructions in shared programs: 17826844 -> 17821805 (-0.03%) instructions in affected programs: 60687 -> 55648 (-8.30%) helped: 28 / HURT: 8 total cycles in shared programs: 905332682 -> 904369499 (-0.11%) cycles in affected programs: 76743509 -> 75780326 (-1.26%) helped: 179 / HURT: 225 total spills in shared programs: 17922 -> 17908 (-0.08%) spills in affected programs: 2495 -> 2481 (-0.56%) helped: 6 / HURT: 8 total fills in shared programs: 26290 -> 25397 (-3.40%) fills in affected programs: 2606 -> 1713 (-34.27%) helped: 8 / HURT: 6 LOST: 1 GAINED: 1 Haswell total instructions in shared programs: 16678878 -> 16674444 (-0.03%) instructions in affected programs: 78458 -> 74024 (-5.65%) helped: 87 / HURT: 6 total cycles in shared programs: 880189381 -> 880301043 (0.01%) cycles in affected programs: 29956463 -> 30068125 (0.37%) helped: 169 / HURT: 163 total spills in shared programs: 14428 -> 14378 (-0.35%) spills in affected programs: 2384 -> 2334 (-2.10%) helped: 8 / HURT: 6 total fills in shared programs: 16975 -> 16881 (-0.55%) fills in affected programs: 1334 -> 1240 (-7.05%) helped: 10 / HURT: 4 Ivy Bridge total instructions in shared programs: 15706048 -> 15706035 (<.01%) instructions in affected programs: 9941 -> 9928 (-0.13%) helped: 13 / HURT: 0 total cycles in shared programs: 433618834 -> 433624637 (<.01%) cycles in affected programs: 12926714 -> 12932517 (0.04%) helped: 52 / HURT: 41 Sandy Bridge total cycles in shared programs: 741223552 -> 741223443 (<.01%) cycles in affected programs: 19814 -> 19705 (-0.55%) helped: 14 / HURT: 0 No changes on Iron Lake or GM45 fossil-db changes: Tiger Lake Instructions in all programs: 156858030 -> 156905532 (+0.0%) Instructions helped: 3915 Instructions hurt: 15411 Cycles in all programs: 7529667771 
-> 7532117340 (+0.0%) Cycles helped: 10260 Cycles hurt: 9990 Spills in all programs: 5610 -> 5457 (-2.7%) Spills helped: 18 Fills in all programs: 6274 -> 6091 (-2.9%) Fills helped: 18 Gained: 2 Lost: 16 Ice Lake Instructions in all programs: 141308082 -> 141303083 (-0.0%) Instructions helped: 574 Instructions hurt: 172 Cycles in all programs: 9091361325 -> 9094622766 (+0.0%) Cycles helped: 8764 Cycles hurt: 11702 Spills in all programs: 7531 -> 7385 (-1.9%) Spills helped: 19 Fills in all programs: 8462 -> 8294 (-2.0%) Fills helped: 19 Gained: 22 Lost: 15 Skylake Instructions in all programs: 131872162 -> 131867263 (-0.0%) Instructions helped: 566 Instructions hurt: 172 Cycles in all programs: 8795095440 -> 8799676943 (+0.1%) Cycles helped: 8333 Cycles hurt: 12182 Spills in all programs: 7006 -> 6884 (-1.7%) Spills helped: 13 Fills in all programs: 7696 -> 7552 (-1.9%) Fills helped: 13 Gained: 24 Lost: 1 Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7698>	2023-08-29 19:01:36 +00:00
Ian Romanick	64c251bb3a	intel/fs: Combine constants for SEL instructions too It is very common to have bcsel where the second and third sources are both constants. This results in a situation where we would want to emit a SEL with two constant sources, but that's not allowed. Previously, we would load both constants into registers, then let constant propagation copy the last constant into the SEL instruction. This results in the constant using an entire SIMD register instead of a single channel. Instead, copy propagate both sources, then let the combine-constants pass do its thing. In the worst case, this stores the constant in a single channel of the SIMD register. In the best case, it reuses a value that was loaded into a register to satisfy another instruction. shader-db results: Tiger Lake, Ice Lake, and Skylake had similar results. (Ice Lake shown) total instructions in shared programs: 19951549 -> 19948709 (-0.01%) instructions in affected programs: 482795 -> 479955 (-0.59%) helped: 1184 / HURT: 3 total cycles in shared programs: 858584724 -> 858205341 (-0.04%) cycles in affected programs: 356168375 -> 355788992 (-0.11%) helped: 1448 / HURT: 1195 total spills in shared programs: 6569 -> 6255 (-4.78%) spills in affected programs: 912 -> 598 (-34.43%) helped: 58 / HURT: 0 total fills in shared programs: 8218 -> 7813 (-4.93%) fills in affected programs: 1570 -> 1165 (-25.80%) helped: 58 / HURT: 0 LOST: 6 GAINED: 16 Broadwell total instructions in shared programs: 17819660 -> 17819389 (<.01%) instructions in affected programs: 1078129 -> 1077858 (-0.03%) helped: 1067 / HURT: 304 total cycles in shared programs: 904722624 -> 905035016 (0.03%) cycles in affected programs: 362583117 -> 362895509 (0.09%) helped: 1381 / HURT: 1123 total spills in shared programs: 17884 -> 17922 (0.21%) spills in affected programs: 5088 -> 5126 (0.75%) helped: 55 / HURT: 152 total fills in shared programs: 25533 -> 26290 (2.96%) fills in affected programs: 12992 -> 13749 (5.83%) 
helped: 61 / HURT: 295 LOST: 7 GAINED: 24 Haswell total instructions in shared programs: 16678080 -> 16673976 (-0.02%) instructions in affected programs: 1162893 -> 1158789 (-0.35%) helped: 1584 / HURT: 7 total cycles in shared programs: 880180082 -> 879932525 (-0.03%) cycles in affected programs: 364067522 -> 363819965 (-0.07%) helped: 1226 / HURT: 976 total spills in shared programs: 14937 -> 14428 (-3.41%) spills in affected programs: 7866 -> 7357 (-6.47%) helped: 351 / HURT: 5 total fills in shared programs: 17572 -> 16975 (-3.40%) fills in affected programs: 11028 -> 10431 (-5.41%) helped: 350 / HURT: 3 LOST: 8 GAINED: 16 Ivy Bridge total instructions in shared programs: 15704044 -> 15703158 (<.01%) instructions in affected programs: 304513 -> 303627 (-0.29%) helped: 707 / HURT: 0 total cycles in shared programs: 433560149 -> 433471118 (-0.02%) cycles in affected programs: 19299650 -> 19210619 (-0.46%) helped: 687 / HURT: 395 LOST: 2 GAINED: 9 Sandy Bridge total instructions in shared programs: 13913386 -> 13912884 (<.01%) instructions in affected programs: 195687 -> 195185 (-0.26%) helped: 455 / HURT: 0 total cycles in shared programs: 741156272 -> 741136266 (<.01%) cycles in affected programs: 10934349 -> 10914343 (-0.18%) helped: 578 / HURT: 289 LOST: 9 GAINED: 4 Iron Lake and GM45 had similar results. 
(Iron Lake shown) total instructions in shared programs: 8364056 -> 8364042 (<.01%) instructions in affected programs: 5178 -> 5164 (-0.27%) helped: 10 / HURT: 0 total cycles in shared programs: 248759794 -> 248757940 (<.01%) cycles in affected programs: 4305246 -> 4303392 (-0.04%) helped: 183 / HURT: 24 fossil-db results: Tiger Lake Instructions in all programs: 156943594 -> 156802601 (-0.1%) Instructions helped: 20595 Instructions hurt: 23248 Cycles in all programs: 7512086950 -> 7528386387 (+0.2%) Cycles helped: 29531 Cycles hurt: 27837 Spills in all programs: 13500 -> 5643 (-58.2%) Spills helped: 394 Spills hurt: 22 Fills in all programs: 18943 -> 6306 (-66.7%) Fills helped: 394 Fills hurt: 11 Gained: 93 Lost: 76 Ice Lake Instructions in all programs: 141395899 -> 141249621 (-0.1%) Instructions helped: 30067 Instructions hurt: 3 Cycles in all programs: 9097127057 -> 9089668235 (-0.1%) Cycles helped: 32268 Cycles hurt: 24315 Spills in all programs: 13695 -> 7564 (-44.8%) Spills helped: 403 Fills in all programs: 18400 -> 8494 (-53.8%) Fills helped: 403 Gained: 114 Lost: 137 Skylake Instructions in all programs: 131948328 -> 131826063 (-0.1%) Instructions helped: 29968 Instructions hurt: 3 Cycles in all programs: 8794778440 -> 8793934844 (-0.0%) Cycles helped: 32705 Cycles hurt: 23575 Spills in all programs: 10526 -> 7039 (-33.1%) Spills helped: 403 Fills in all programs: 11025 -> 7728 (-29.9%) Fills helped: 403 Gained: 102 Lost: 250 Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7698>	2023-08-29 19:01:36 +00:00
Ian Romanick	44d62a5224	intel/fs: Completely re-write the combine constants pass This is a squash of what in the original MR was "util: Add generic pass that tries to combine constants" and "intel/fs: Switch to using util_combine_constants". The new algorithm uses a multi-pass greedy algorithm that attempts to collect constants for loading in order of increasing degrees of freedom. The first pass collects constants that must be emitted as-is (e.g., without source modifiers). The second pass emits all constants that must be emitted (because they are used in a source field that cannot be a literal constant) but that can have a source modifier. The final pass possibly emits constants that may not have to be emitted. This is used for instructions where one of the fields is allowed to be a constant. This is not used in the current commit, but future commits that enable SEL will use this. The SEL instruction can have a single constant, but when both sources are constant, one of the sources has to be loaded into a register. By loading constants in this order, required "choices" made in earlier passes may be re-used in later passes. This provides a more optimal result. At this point in the series, most platforms have the same results with the new implementation. Gen7 platforms see a significant number of "small" changes. Due to the coissue optimization on Gen7, each shader is likely to have most constants affected by constant combining. If a shader has only a single basic block, constants are packed into registers in the order produced by the constant combining process. Since each constant has a different live range in the shader, even slightly different packing orders can have dramatic effects on the live range of a register. Even in cases where this does not affect register pressure in a meaningful way, it can cause the scheduler to make very different choices about the ordering of instructions. From my analysis (using the `if (debug) { ... 
}` block at the end of fs_visitor::opt_combine_constants), the old implementation and the new implementation pick the same set of constants, but the order produced may be slightly different. For the smaller number of values in non-Gfx7 shaders, the orders are similar enough to not matter. No shader-db or fossil-db changes on any non-Gfx7 platforms. Haswell and Ivy Bridge had similar results. (Haswell shown) total cycles in shared programs: 879930036 -> 880001666 (<.01%) cycles in affected programs: 22485040 -> 22556670 (0.32%) helped: 1879 HURT: 2309 helped stats (abs) min: 1 max: 6296 x̄: 258.54 x̃: 34 helped stats (rel) min: <.01% max: 54.63% x̄: 3.88% x̃: 0.87% HURT stats (abs) min: 1 max: 9739 x̄: 241.41 x̃: 40 HURT stats (rel) min: <.01% max: 160.50% x̄: 6.01% x̃: 0.99% 95% mean confidence interval for cycles value: -1.04 35.25 95% mean confidence interval for cycles %-change: 1.23% 1.92% Inconclusive result (value mean confidence interval includes 0). LOST: 82 GAINED: 39 Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7698>	2023-08-29 19:01:36 +00:00
Ian Romanick	9a9a86013c	intel/fs: Allow HF const in MAD on Gfx12.5 if all sources are HF Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23262>	2023-06-06 06:10:53 +00:00
Ian Romanick	4f272bf001	intel/fs: Fix handling of W, UW, and HF constants in combine_constants Sources that are already W, UW, or HF can be represented as those types by definition. Pass them through. Previously an HF source on a MAD would have been marked as !can_promote. I'm pretty sure this means it would get moved out to a register, but I did not verify this. For ADD3, a constant source could be D or UD. In this case, the value must be tested to determine whether it can be represented as W or UW. The patterns in opt_algebraic won't generate an ADD3 with constant source, so this problem cannot occur yet. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23262>	2023-06-06 06:10:53 +00:00
Ian Romanick	2016d9f46c	intel/fs: Rework the loop of opt_combine_constants that collects constants This is a bit more wordy, but it will greatly simplify some future changes. v2: Rebase on ADD3 changes. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22274>	2023-04-03 21:50:06 +00:00
Ian Romanick	9e4bb4bfcf	intel/fs: Refactor part of opt_combine_constants to a separate function Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22274>	2023-04-03 21:50:06 +00:00
Ian Romanick	593cde0432	intel/fs: Output opt_combine_constants debug to stderr It's a lot more useful to have it in the same stream with the INTEL_DEBUG=fs output. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22274>	2023-04-03 21:50:06 +00:00
Sagar Ghuge	0608e76e00	intel/compiler: Fix missing break in switch CoverityID: 1487496 Fixes: `cde9ca616d` "intel/compiler: Make decision based on source type instead of opcode" Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11985>	2021-07-22 23:38:04 +00:00
Sagar Ghuge	e6db2299a8	intel/compiler: Allow ternary add to promote source to immediate Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11596>	2021-07-16 15:59:56 +00:00
Sagar Ghuge	cde9ca616d	intel/compiler: Make decision based on source type instead of opcode This patch restructures the code a little bit to check whether a source can be represented as an immediate operand. This is a foundation for the next patch, which adds checks for integer operands as well. Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11596>	2021-07-16 15:59:56 +00:00
Anuj Phogat	61e8636557	intel: Rename gen_device prefix to intel_device export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "gen_device" -rIl $SEARCH_PATH \| xargs sed -ie "s/gen_device/intel_device/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>	2021-04-20 20:06:33 +00:00
Jordan Justen	262cb08557	intel/fs: Disable 3-src immediates on XeHP. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> [ Francisco Jerez: Add TODO comment explaining why this is helpful and how we could better fix it. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000>	2021-04-16 08:27:35 +00:00
Anuj Phogat	abe9a71a09	intel: Rename gen field in gen_device_info struct to ver Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "info\)(.\|->)gen" -rIl $SEARCH_PATH \| xargs sed -ie "s/info\()\)\(\.\\|->\)gen/info\1\2ver/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Francisco Jerez	c2a7eababf	intel/compiler: Move idom tree calculation and related logic into analysis object This only does half of the work. The actual representation of the idom tree is left untouched, but the computation algorithm is moved into a separate analysis result class wrapped in a BRW_ANALYSIS object, along with the intersect() and dump_domtree() auxiliary functions in order to keep things tidy. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4012>	2020-03-06 10:21:03 -08:00
Francisco Jerez	ab6d792986	intel/compiler: Pass detailed dependency classes to invalidate_analysis() Have fun reading through the whole back-end optimizer to verify whether I've missed any dependency flags -- Or alternatively, just trust that any mistake here will trigger an assertion failure during analysis pass validation if it ever poses a problem for the consistency of any of the analysis passes managed by the framework. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4012>	2020-03-06 10:20:39 -08:00
Francisco Jerez	d966a6b4c4	intel/compiler: Introduce backend_shader method to propagate IR changes to analysis passes The invalidate_analysis() method knows what analysis passes there are in the back-end and calls their invalidate() method to report changes in the IR. For the moment it just calls invalidate_live_intervals() (which will eventually be fully replaced by this function) if anything changed. This makes all optimization passes invalidate DEPENDENCY_EVERYTHING, which is clearly far from ideal -- The dependency classes passed to invalidate_analysis() will be refined in a future commit. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4012>	2020-03-06 10:20:32 -08:00
Ian Romanick	59488cbbac	intel/fs: Don't count integer instructions as being possibly coissue Integer instructions don't coissue. Before `e64be391dd` ("intel/compiler: generalize the combine constants pass"), this pass only looked at float sources. There's no shader-db data in that commit, so I collected some. The results are not good: Haswell total instructions in shared programs: 11898805 -> 11908127 (0.08%) instructions in affected programs: 1218680 -> 1228002 (0.76%) helped: 2 HURT: 5171 helped stats (abs) min: 12 max: 111 x̄: 61.50 x̃: 61 helped stats (rel) min: 1.59% max: 9.20% x̄: 5.40% x̃: 5.40% HURT stats (abs) min: 1 max: 311 x̄: 1.83 x̃: 1 HURT stats (rel) min: 0.02% max: 9.91% x̄: 1.05% x̃: 0.70% 95% mean confidence interval for instructions value: 1.55 2.05 95% mean confidence interval for instructions %-change: 1.02% 1.08% Instructions are HURT. total cycles in shared programs: 221664974 -> 221404750 (-0.12%) cycles in affected programs: 120012620 -> 119752396 (-0.22%) helped: 3464 HURT: 3159 helped stats (abs) min: 1 max: 428160 x̄: 314.55 x̃: 16 helped stats (rel) min: <.01% max: 57.33% x̄: 3.40% x̃: 1.28% HURT stats (abs) min: 1 max: 87846 x̄: 262.54 x̃: 14 HURT stats (rel) min: <.01% max: 85.57% x̄: 3.01% x̃: 0.77% 95% mean confidence interval for cycles value: -224.23 145.65 95% mean confidence interval for cycles %-change: -0.50% -0.19% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 9804 -> 10047 (2.48%) spills in affected programs: 6869 -> 7112 (3.54%) helped: 2 HURT: 41 total fills in shared programs: 19863 -> 20319 (2.30%) fills in affected programs: 17428 -> 17884 (2.62%) helped: 2 HURT: 41 LOST: 20 GAINED: 13 This also prevents regressions in "intel/fs: Promote integer constants after lowering integer multiplication" (note: that patch will probably not be committed). 
When the passes are reordered, code like mul(8) acc0<1>D g9<8,8,1>D -2078209981D { align1 1Q }; gets turned into mov(1) g23<1>D 2078209981D { align1 WE_all 1N }; ... mul(8) acc0<1>D g13<8,8,1>D -g23<0,1,0>D { align1 1Q compacted }; It's not 100% clear why, but these produce different results. Note that -2078209981 & 0x0ffff = 0x0843, and -(2078209981 & 0x0ffff) = 0xffff0843. It seems like the upper 16-bits of the negation should be ignored. Fixes: `e64be391dd` ("intel/compiler: generalize the combine constants pass") Cc: Iago Toral Quiroga <itoral@igalia.com> Suggested-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> The shaders with spills or fills hurt are the usual suspects. A couple compute shaders in Dirt Showdown and a compute shader in Bioshock Infinite. On Haswell, a compute shader (that appears twice in shader-db) from Aztec Ruins was also hurt for spill and fills. Haswell total instructions in shared programs: 11573934 -> 11568335 (-0.05%) instructions in affected programs: 828623 -> 823024 (-0.68%) helped: 2825 HURT: 6 helped stats (abs) min: 1 max: 134 x̄: 2.16 x̃: 1 helped stats (rel) min: 0.02% max: 9.05% x̄: 0.84% x̃: 0.61% HURT stats (abs) min: 1 max: 216 x̄: 81.83 x̃: 56 HURT stats (rel) min: 0.16% max: 8.65% x̄: 4.21% x̃: 4.68% 95% mean confidence interval for instructions value: -2.31 -1.64 95% mean confidence interval for instructions %-change: -0.85% -0.80% Instructions are helped. total cycles in shared programs: 187573593 -> 187004633 (-0.30%) cycles in affected programs: 82816107 -> 82247147 (-0.69%) helped: 2186 HURT: 1741 helped stats (abs) min: 1 max: 35230 x̄: 326.96 x̃: 16 helped stats (rel) min: <.01% max: 46.11% x̄: 3.11% x̃: 0.90% HURT stats (abs) min: 1 max: 6138 x̄: 83.73 x̃: 16 HURT stats (rel) min: <.01% max: 104.11% x̄: 2.73% x̃: 0.75% 95% mean confidence interval for cycles value: -197.13 -92.64 95% mean confidence interval for cycles %-change: -0.72% -0.33% Cycles are helped.
total spills in shared programs: 7870 -> 7743 (-1.61%) spills in affected programs: 2260 -> 2133 (-5.62%) helped: 31 HURT: 5 total fills in shared programs: 6320 -> 6263 (-0.90%) fills in affected programs: 3547 -> 3490 (-1.61%) helped: 31 HURT: 6 LOST: 9 GAINED: 9 Ivybridge total instructions in shared programs: 11863372 -> 11859793 (-0.03%) instructions in affected programs: 757183 -> 753604 (-0.47%) helped: 2236 HURT: 3 helped stats (abs) min: 1 max: 81 x̄: 1.86 x̃: 1 helped stats (rel) min: 0.03% max: 5.26% x̄: 0.74% x̃: 0.48% HURT stats (abs) min: 11 max: 301 x̄: 192.33 x̃: 265 HURT stats (rel) min: 1.55% max: 10.51% x̄: 6.89% x̃: 8.62% 95% mean confidence interval for instructions value: -2.01 -1.18 95% mean confidence interval for instructions %-change: -0.77% -0.70% Instructions are helped. total cycles in shared programs: 178377378 -> 177946087 (-0.24%) cycles in affected programs: 76261390 -> 75830099 (-0.57%) helped: 1635 HURT: 1395 helped stats (abs) min: 1 max: 34796 x̄: 333.53 x̃: 16 helped stats (rel) min: <.01% max: 47.15% x̄: 2.82% x̃: 0.64% HURT stats (abs) min: 1 max: 4315 x̄: 81.74 x̃: 18 HURT stats (rel) min: <.01% max: 49.98% x̄: 1.99% x̃: 0.53% 95% mean confidence interval for cycles value: -197.06 -87.62 95% mean confidence interval for cycles %-change: -0.78% -0.43% Cycles are helped. total spills in shared programs: 4188 -> 4182 (-0.14%) spills in affected programs: 1557 -> 1551 (-0.39%) helped: 30 HURT: 3 total fills in shared programs: 5056 -> 5245 (3.74%) fills in affected programs: 2708 -> 2897 (6.98%) helped: 30 HURT: 3 LOST: 5 GAINED: 1 No shader-db changes on any other Intel platform. Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3544> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3544>	2020-02-05 15:13:17 +00:00
Sagar Ghuge	18b28b5654	intel/compiler: Don't move immediate in register On Gen12, we support mixed-mode HF/F operands, and 3-source instructions also support immediate values, so keep the immediate as it is if it fits in a 16-bit field. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-10-21 20:32:43 -07:00
Maya Rashish	e16fadd545	intel/compiler: avoid truncating int64_t to int Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Maya Rashish <maya@netbsd.org>	2019-09-26 17:46:26 +00:00

56 commits