fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-22 20:00:10 +01:00

Author	SHA1	Message	Date
Sushma Venkatesh Reddy	4084527876	intel/compiler: Always run opt_algebraic after descriptor_lowering This change ensures that `brw_opt_algebraic` is always executed after `brw_lower_send_descriptors` in `brw_opt.cpp`. By doing so, redundant logical operations are optimized, resulting in cleaner and more compact assembly output. fossil-db results on LNL: - Totals: - Instructions: 215857290 -> 215857028 (-0.00%) - Cycle count: 32008929636 -> 32008935384 (+0.00%); split: -0.00%, +0.00% - Max live registers: 66940643 -> 66940557 (-0.00%) - Affected shaders (104 out of 713963): - Instructions: 31090 -> 30828 (-0.84%) - Cycle count: 5955908 -> 5961656 (+0.10%); split: -0.16%, +0.26% - Max live registers: 10888 -> 10802 (-0.79%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34615>	2025-04-19 07:05:54 +00:00
Lionel Landwerlin	06ad9a25e5	brw: fix Wa_22013689345 emission 2 problems : - not detecting null destination correctly - applied too late using SHADER_OPCODE_MEMORY_FENCE, when lowering already happened Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34319>	2025-04-10 16:44:28 +00:00
Ian Romanick	20cce95ce5	brw/opt: Don't call brw_opt_copy_propagation before brw_lower_load_reg On a 36c/72t Xeon system, performance of replaying hogwarts_legacy.dx12vk-ultra.foz was improved 1.3% +/- 0.77% (n=10). I picked MTL for the fossil-db results because it was the most negative. shader-db: All Intel platforms had fairly similar results. (Lunar Lake) total instructions in shared programs: 16964217 -> 16964216 (<.01%) instructions in affected programs: 51777 -> 51776 (<.01%) helped: 20 / HURT: 27 total cycles in shared programs: 892934916 -> 893041912 (0.01%) cycles in affected programs: 51245298 -> 51352294 (0.21%) helped: 96 /HURT: 78 fossil-db: All Intel platforms had similar results. (Meteor Lake shown) Totals: Instrs: 233678547 -> 233678944 (+0.00%); split: -0.00%, +0.00% Cycle count: 24398049850 -> 24400490877 (+0.01%); split: -0.01%, +0.02% Max live registers: 42145052 -> 42145038 (-0.00%); split: -0.00%, +0.00% Totals from 1141 (0.14% of 805934) affected shaders: Instrs: 1546001 -> 1546398 (+0.03%); split: -0.01%, +0.03% Cycle count: 1201746062 -> 1204187089 (+0.20%); split: -0.14%, +0.34% Max live registers: 84247 -> 84233 (-0.02%); split: -0.03%, +0.01% Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>	2025-04-04 06:45:02 +00:00
Ian Romanick	2d13acf9d9	brw: Add passes to generate and lower load_reg v2: Add support for WE_all instructions... this already just worked, so I only had to delete the check and the FINISHME comment. v3: Use logic more like def_analysis::update_for_reads to determine when to not insert LOAD_REG instructions. Based on a suggestion by Ken. v4: Eliminate "store" from all the names since STORE_REG does not exist anymore. Fold insert_load_reg into brw_insert_load_reg. Elminate extra call to s.def_analysis.require() after progress. Pull a loop-invariant check out of the inst->srouces loop. Drop call to brw_opt_split_virtual_grfs after lowering load_reg. All suggested by Caio. v5: Assert that LOAD_REG doesn't already exist in brw_insert_load_reg. Update comment before fully_defines. Both suggested by Caio. v6: Don't explicitly special-case SHADER_OPCODE_MEMORY_STORE_LOGICAL. Move the inst->dst.file != VGRF check earlier to avoid the loop over sources. Both suggested by Ken. Move the call the brw_insert_load_reg a little bit later, and explain why it's at that location. Suggested by Caio. v7: Many changes to the for-each-source loop in brw_insert_load_reg. Removes incorrect multiplication of s.alloc.sizes with reg_unit. Adds checks for matching SIMD size and NoMask in the search for pre-existing LOAD_REG of same value. v8: Add some unit tests. Suggested by Caio. 
shader-db: Lunar Lake total instructions in shared programs: 16923237 -> 16921895 (<.01%) instructions in affected programs: 450565 -> 449223 (-0.30%) helped: 251 / HURT: 377 total cycles in shared programs: 910428418 -> 889920590 (-2.25%) cycles in affected programs: 719248184 -> 698740356 (-2.85%) helped: 9076 / HURT: 9082 total fills in shared programs: 2242 -> 2218 (-1.07%) fills in affected programs: 116 -> 92 (-20.69%) helped: 2 / HURT: 0 total sends in shared programs: 848635 -> 848421 (-0.03%) sends in affected programs: 810 -> 596 (-26.42%) helped: 10 / HURT: 0 LOST: 82 GAINED: 78 Meteor Lake and DG2 had similar results. (Meteor Lake shown) total instructions in shared programs: 19875784 -> 19871694 (-0.02%) instructions in affected programs: 1050091 -> 1046001 (-0.39%) helped: 251 / HURT: 2403 total cycles in shared programs: 905328238 -> 882446458 (-2.53%) cycles in affected programs: 682736344 -> 659854564 (-3.35%) helped: 7869 / HURT: 7911 total spills in shared programs: 5512 -> 5032 (-8.71%) spills in affected programs: 1830 -> 1350 (-26.23%) helped: 8 / HURT: 0 total fills in shared programs: 5648 -> 4782 (-15.33%) fills in affected programs: 3312 -> 2446 (-26.15%) helped: 8 / HURT: 0 total sends in shared programs: 1032942 -> 1032722 (-0.02%) sends in affected programs: 572 -> 352 (-38.46%) helped: 10 / HURT: 0 LOST: 138 GAINED: 53 Tiger Lake total instructions in shared programs: 19711930 -> 19715591 (0.02%) instructions in affected programs: 1040623 -> 1044284 (0.35%) helped: 317 / HURT: 2474 total cycles in shared programs: 862988990 -> 860573870 (-0.28%) cycles in affected programs: 612392461 -> 609977341 (-0.39%) helped: 7447 / HURT: 7686 total sends in shared programs: 1034763 -> 1034555 (-0.02%) sends in affected programs: 784 -> 576 (-26.53%) helped: 8 / HURT: 0 LOST: 56 GAINED: 143 Ice Lake and Skylake had similar results. 
(Ice Lake shown) total instructions in shared programs: 20545461 -> 20545220 (<.01%) instructions in affected programs: 422405 -> 422164 (-0.06%) helped: 180 / HURT: 459 total cycles in shared programs: 872697345 -> 866874523 (-0.67%) cycles in affected programs: 573117917 -> 567295095 (-1.02%) helped: 6783 / HURT: 6980 total spills in shared programs: 4335 -> 4336 (0.02%) spills in affected programs: 90 -> 91 (1.11%) helped: 1 / HURT: 2 total fills in shared programs: 4194 -> 4196 (0.05%) fills in affected programs: 463 -> 465 (0.43%) helped: 1 / HURT: 2 total sends in shared programs: 1079446 -> 1079238 (-0.02%) sends in affected programs: 784 -> 576 (-26.53%) helped: 8 / HURT: 0 LOST: 117 GAINED: 37 fossil-db: All Intel platforms had similar results. (Lunar Lake shown) Totals: Instrs: 209708136 -> 209695617 (-0.01%); split: -0.02%, +0.01% Send messages: 10927753 -> 10927640 (-0.00%) Cycle count: 30540172048 -> 30427084732 (-0.37%); split: -0.99%, +0.62% Spill count: 511621 -> 510932 (-0.13%); split: -0.22%, +0.08% Fill count: 621166 -> 618440 (-0.44%); split: -0.56%, +0.12% Scratch Memory Size: 35574784 -> 35648512 (+0.21%); split: -0.06%, +0.26% Max live registers: 65453860 -> 65453140 (-0.00%); split: -0.00%, +0.00% Non SSA regs after NIR: 75374990 -> 35195764 (-53.31%) Totals from 503284 (71.25% of 706391) affected shaders: Instrs: 180203778 -> 180191259 (-0.01%); split: -0.02%, +0.01% Send messages: 9699732 -> 9699619 (-0.00%) Cycle count: 30080349592 -> 29967262276 (-0.38%); split: -1.01%, +0.63% Spill count: 511584 -> 510895 (-0.13%); split: -0.22%, +0.08% Fill count: 621120 -> 618394 (-0.44%); split: -0.56%, +0.12% Scratch Memory Size: 35443712 -> 35517440 (+0.21%); split: -0.06%, +0.27% Max live registers: 52566092 -> 52565372 (-0.00%); split: -0.01%, +0.00% Non SSA regs after NIR: 70110949 -> 29931723 (-57.31%) Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>	2025-04-04 
06:45:02 +00:00
Ian Romanick	b9656d51c0	brw/opt: Move non-SSA register accounting after first brw_opt_split_virtual_grfs v2: Move to immediately before the main optimization loop. Most importantly, this is after the first call to DCE. fossil-db: All Intel platforms had similar results. (Lunar Lake shown) Totals: Non SSA regs after NIR: 237045283 -> 100183460 (-57.74%); split: -58.12%, +0.39% Totals from 701423 (99.26% of 706657) affected shaders: Non SSA regs after NIR: 236868848 -> 100007025 (-57.78%); split: -58.17%, +0.39% Suggested-by: Ken Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>	2025-04-04 06:45:02 +00:00
Caio Oliveira	308f56ef82	brw: Add missing dependency classes to various passes - brw_lower_3src_null_dest: Allocating a new destination, so include INSTRUCTION_DATA_FLOW class. - brw_lower_alu_restriction: Removing instruction, so include INSTRUCTION_IDENTITY. No details are changed so remove INSTRUCTION_DETAIL. - brw_lower_vgrfs_to_fixed_grfs: Changing source and destination numbers, so include INSTRUCTION_DETAIL. - brw_lower_send_gather: Insert new instructions (scalar register) and change sources and other information on existing ones. So include INSTRUCTION_DETAIL and INSTRUCTION_IDENTITY. Promote to INSTRUCTIONS. - brw_opt_eliminate_find_live_channel: Can change source, so include INSTRUCTION_DATA_FLOW. - brw_opt_copy_propagation_defs and brw_opt_cse_defs: Both can remove instructions, so include INSTRUCTION_IDENTITY. Promote to INSTRUCTIONS. - brw_opt_saturate_propagation: Instruction can have `sat` modified, and operands can have type modified, so include INSTRUCTION_DETAIL. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33993>	2025-03-12 22:44:10 +00:00
Caio Oliveira	32e562ae01	brw: Simplify brw_builder "insert before inst" constructor Since brw_inst now has the block it belongs and the block can reach the shader, the only necessary information to create a builder is the brw_inst itself. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33815>	2025-03-06 23:33:38 +00:00
Caio Oliveira	66307811c3	brw: Remove block parameter from brw_inst::remove() Use brw_inst::block instead. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33815>	2025-03-06 23:33:38 +00:00
Kenneth Graunke	88309a9818	brw: Rename shared function enums for clarity Our name for this enum was brw_message_target, but it's better known as shared function ID or SFID. Call it brw_sfid to make it easier to find. Now that brw only supports Gfx9+, we don't particularly care whether SFIDs were introduced on Gfx4, Gfx6, or Gfx7.5. Also, the LSC SFIDs were confusingly tagged "GFX12" but aren't available on Gfx12.0; they were introduced with Alchemist/Meteorlake. GFX6_SFID_DATAPORT_SAMPLER_CACHE in particular was confusing. It sounds like the SFID to use for the sampler on Gfx6+, however it has nothing to do with the sampler at all. BRW_SFID_SAMPLER remains the sampler SFID. On Haswell, we ran out of messages on the main data cache data port, and so they introduced two additional ones, for more messages. The modern Tigerlake PRMs simply call these DP_DC0, DP_DC1, and DP_DC2. I think the "sampler" name came from some idea about reorganizing messages that never materialized (instead, the LSC came as a much larger cleanup). Recently we've adopted the term "HDC" for the legacy data cluster, as opposed to "LSC" for the modern Load/Store Cache. To make clear which SFIDs target the legacy HDC dataports, we use BRW_SFID_HDC0/1/2. We were also citing the G45, Sandybridge, and Ivybridge PRMs for a compiler that supports none of those platforms. Cite modern docs. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33650>	2025-02-27 08:49:24 +00:00
Caio Oliveira	ff44f4d278	intel/brw: Update outdated comments Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32536>	2025-02-11 09:13:28 +00:00
Caio Oliveira	cf3bb77224	intel/brw: Rename fs_visitor to brw_shader Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32536>	2025-02-11 09:13:28 +00:00
Caio Oliveira	352a63122f	intel/brw: Rename files brw_fs.cpp/h to brw_shader.cpp/h Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32536>	2025-02-11 09:13:28 +00:00
Caio Oliveira	f82bcd56fc	intel/brw: Add functions to allocate VGRF space Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33334>	2025-02-06 08:33:03 -08:00
Caio Oliveira	ea87bab4ce	intel/brw: Remove 'using namespace brw' directives Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33418>	2025-02-06 07:58:55 -08:00
Caio Oliveira	1ade9a05d8	intel/brw: Use brw prefix instead of namespace for analysis implementations Also drop the 'fs' prefix when applicable. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33048>	2025-02-05 21:47:07 +00:00
Caio Oliveira	2b92eb0b2c	intel/brw: Use brw prefix instead of namespace for dep analysis enum Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33048>	2025-02-05 21:47:07 +00:00
Caio Oliveira	d59bd421a2	intel/brw: Rename fs_inst to brw_inst Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33114>	2025-01-31 00:57:21 +00:00
Caio Oliveira	f18dee3618	intel/brw: Fallback to SEND from SEND_GATHER if possible After optimization happen, if the sources are still in one or two contigous spans for some reason (e.g. some data read from memory now being written), it is beneficial to just use regular SEND and avoid having to set the ARF scalar instruction. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <None> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32410>	2025-01-30 04:43:58 +00:00
Caio Oliveira	b6b32933ad	intel/brw: Use SHADER_OPCODE_SEND_GATHER in Xe3 Add an optimization pass to turn regular SENDs into SEND_GATHERs. This allows the payload to be "broken" into smaller pieces that can be further optimized, which _may_ result in - less register pressure (no need to contiguous space), and - less instructions (no need to MOV to such space). For debugging, the INTEL_DEBUG=no-send-gather option skips this optimization, and reporting how many opportunities were missed. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <None> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32410>	2025-01-30 04:43:58 +00:00
Caio Oliveira	5ac82efd35	intel/brw: Rename fs_builder to brw_builder Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33076>	2025-01-18 16:12:55 +00:00
Caio Oliveira	f2d4c9db92	intel/brw: Rename brw_fs_builder.h to brw_builder.h Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33076>	2025-01-18 16:12:54 +00:00
Caio Oliveira	634daf2827	intel/brw: Rename brw_fs_validate to brw_validate Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32843>	2025-01-13 23:56:22 +00:00
Lionel Landwerlin	8ac7802ac8	brw: move final send lowering up into the IR Because we do emit the final send message form in code generation, a lot of emissions look like this : add(8) vgrf0, u0, 0x100 mov(1) a0.1, vgrf0 # emitted by the generator send(8) ..., a0.1 By moving address register manipulation in the IR, we can get this down to : add(1) a0.1, u0, 0x100 send(8) ..., a0.1 This reduce register pressure around some send messages by 1 vgrf. All lost shaders in the below results are fragment SIMD32, due to the throughput estimator. If turned off, we loose no SIMD32 shaders with this change. DG2 results: Assassin's Creed Valhalla: Totals from 2044 (96.87% of 2110) affected shaders: Instrs: 852879 -> 832044 (-2.44%); split: -2.45%, +0.00% Subgroup size: 23832 -> 23824 (-0.03%) Cycle count: 53345742 -> 52144277 (-2.25%); split: -5.08%, +2.82% Spill count: 729 -> 554 (-24.01%); split: -28.40%, +4.39% Fill count: 2005 -> 1256 (-37.36%) Scratch Memory Size: 25600 -> 19456 (-24.00%); split: -32.00%, +8.00% Max live registers: 116765 -> 115058 (-1.46%) Max dispatch width: 19152 -> 18872 (-1.46%); split: +0.21%, -1.67% Cyberpunk 2077: Totals from 1181 (93.43% of 1264) affected shaders: Instrs: 667192 -> 663615 (-0.54%); split: -0.55%, +0.01% Subgroup size: 13016 -> 13032 (+0.12%) Cycle count: 17383539 -> 17986073 (+3.47%); split: -0.93%, +4.39% Spill count: 12 -> 8 (-33.33%) Fill count: 9 -> 6 (-33.33%) Dota2: Totals from 173 (11.59% of 1493) affected shaders: Cycle count: 274403 -> 280817 (+2.34%); split: -0.01%, +2.34% Max live registers: 5787 -> 5779 (-0.14%) Max dispatch width: 1344 -> 1152 (-14.29%) Hitman3: Totals from 5072 (95.39% of 5317) affected shaders: Instrs: 2879952 -> 2841804 (-1.32%); split: -1.32%, +0.00% Cycle count: 153208505 -> 165860401 (+8.26%); split: -2.22%, +10.48% Spill count: 3942 -> 3200 (-18.82%) Fill count: 10158 -> 8846 (-12.92%) Scratch Memory Size: 257024 -> 223232 (-13.15%) Max live registers: 328467 -> 324631 (-1.17%) Max dispatch 
width: 43928 -> 42768 (-2.64%); split: +0.09%, -2.73% Fortnite: Totals from 360 (4.82% of 7472) affected shaders: Instrs: 778068 -> 777925 (-0.02%) Subgroup size: 3128 -> 3136 (+0.26%) Cycle count: 38684183 -> 38734579 (+0.13%); split: -0.06%, +0.19% Max live registers: 50689 -> 50658 (-0.06%) Hogwarts Legacy: Totals from 1376 (84.00% of 1638) affected shaders: Instrs: 758810 -> 749727 (-1.20%); split: -1.23%, +0.03% Cycle count: 27778983 -> 28805469 (+3.70%); split: -1.42%, +5.12% Spill count: 2475 -> 2299 (-7.11%); split: -7.47%, +0.36% Fill count: 2677 -> 2445 (-8.67%); split: -9.90%, +1.23% Scratch Memory Size: 99328 -> 89088 (-10.31%) Max live registers: 84969 -> 84671 (-0.35%); split: -0.58%, +0.23% Max dispatch width: 11848 -> 11920 (+0.61%) Metro Exodus: Totals from 92 (0.21% of 43072) affected shaders: Instrs: 262995 -> 262968 (-0.01%) Cycle count: 13818007 -> 13851266 (+0.24%); split: -0.01%, +0.25% Max live registers: 11152 -> 11140 (-0.11%) Red Dead Redemption 2 : Totals from 451 (7.71% of 5847) affected shaders: Instrs: 754178 -> 753811 (-0.05%); split: -0.05%, +0.00% Cycle count: 3484078523 -> 3484111965 (+0.00%); split: -0.00%, +0.00% Max live registers: 42294 -> 42185 (-0.26%) Spiderman Remastered: Totals from 6820 (98.02% of 6958) affected shaders: Instrs: 6921500 -> 6747933 (-2.51%); split: -4.16%, +1.65% Cycle count: 234400692460 -> 236846720707 (+1.04%); split: -0.20%, +1.25% Spill count: 72971 -> 72622 (-0.48%); split: -8.08%, +7.61% Fill count: 212921 -> 198483 (-6.78%); split: -12.37%, +5.58% Scratch Memory Size: 3491840 -> 3410944 (-2.32%); split: -12.05%, +9.74% Max live registers: 493149 -> 487458 (-1.15%) Max dispatch width: 56936 -> 56856 (-0.14%); split: +0.06%, -0.20% Strange Brigade: Totals from 3769 (91.21% of 4132) affected shaders: Instrs: 1354476 -> 1321474 (-2.44%) Cycle count: 25351530 -> 25339190 (-0.05%); split: -1.64%, +1.59% Max live registers: 199057 -> 193656 (-2.71%) Max dispatch width: 30272 -> 30240 (-0.11%) Witcher 3: 
Totals from 25 (2.40% of 1041) affected shaders: Instrs: 24621 -> 24606 (-0.06%) Cycle count: 2218793 -> 2217503 (-0.06%); split: -0.11%, +0.05% Max live registers: 1963 -> 1955 (-0.41%) LNL results: Assassin's Creed Valhalla: Totals from 1928 (98.02% of 1967) affected shaders: Instrs: 856107 -> 835756 (-2.38%); split: -2.48%, +0.11% Subgroup size: 41264 -> 41280 (+0.04%) Cycle count: 64606590 -> 62371700 (-3.46%); split: -5.57%, +2.11% Spill count: 915 -> 669 (-26.89%); split: -32.79%, +5.90% Fill count: 2414 -> 1617 (-33.02%); split: -36.62%, +3.60% Scratch Memory Size: 62464 -> 44032 (-29.51%); split: -36.07%, +6.56% Max live registers: 205483 -> 202192 (-1.60%) Cyberpunk 2077: Totals from 1177 (96.40% of 1221) affected shaders: Instrs: 682237 -> 678931 (-0.48%); split: -0.51%, +0.03% Subgroup size: 24912 -> 24944 (+0.13%) Cycle count: 24355928 -> 25089292 (+3.01%); split: -0.80%, +3.81% Spill count: 8 -> 3 (-62.50%) Fill count: 6 -> 3 (-50.00%) Max live registers: 126922 -> 125472 (-1.14%) Dota2: Totals from 428 (32.47% of 1318) affected shaders: Instrs: 89355 -> 89740 (+0.43%) Cycle count: 1152412 -> 1152706 (+0.03%); split: -0.52%, +0.55% Max live registers: 32863 -> 32847 (-0.05%) Fortnite: Totals from 5354 (81.72% of 6552) affected shaders: Instrs: 4135059 -> 4239015 (+2.51%); split: -0.01%, +2.53% Cycle count: 132557506 -> 132427302 (-0.10%); split: -0.75%, +0.65% Spill count: 7144 -> 7234 (+1.26%); split: -0.46%, +1.72% Fill count: 12086 -> 12403 (+2.62%); split: -0.73%, +3.35% Scratch Memory Size: 600064 -> 604160 (+0.68%); split: -1.02%, +1.71% Hitman3: Totals from 4912 (97.09% of 5059) affected shaders: Instrs: 2952124 -> 2916824 (-1.20%); split: -1.20%, +0.00% Cycle count: 179985656 -> 189175250 (+5.11%); split: -2.44%, +7.55% Spill count: 3739 -> 3136 (-16.13%) Fill count: 10657 -> 9564 (-10.26%) Scratch Memory Size: 373760 -> 318464 (-14.79%) Max live registers: 597566 -> 589460 (-1.36%) Hogwarts Legacy: Totals from 1471 (96.33% of 1527) affected 
shaders: Instrs: 748749 -> 766214 (+2.33%); split: -0.71%, +3.05% Cycle count: 33301528 -> 34426308 (+3.38%); split: -1.30%, +4.68% Spill count: 3278 -> 3070 (-6.35%); split: -8.30%, +1.95% Fill count: 4553 -> 4097 (-10.02%); split: -10.85%, +0.83% Scratch Memory Size: 251904 -> 217088 (-13.82%) Max live registers: 168911 -> 168106 (-0.48%); split: -0.59%, +0.12% Metro Exodus: Totals from 18356 (49.81% of 36854) affected shaders: Instrs: 7559386 -> 7621591 (+0.82%); split: -0.01%, +0.83% Cycle count: 195240612 -> 196455186 (+0.62%); split: -1.22%, +1.84% Spill count: 595 -> 546 (-8.24%) Fill count: 1604 -> 1408 (-12.22%) Max live registers: 2086937 -> 2086933 (-0.00%) Red Dead Redemption 2: Totals from 4171 (79.31% of 5259) affected shaders: Instrs: 2619392 -> 2719587 (+3.83%); split: -0.00%, +3.83% Subgroup size: 86416 -> 86432 (+0.02%) Cycle count: 8542836160 -> 8531976886 (-0.13%); split: -0.65%, +0.53% Fill count: 12949 -> 12970 (+0.16%); split: -0.43%, +0.59% Scratch Memory Size: 401408 -> 385024 (-4.08%) Spiderman Remastered: Totals from 6639 (98.94% of 6710) affected shaders: Instrs: 6877980 -> 6800592 (-1.13%); split: -3.11%, +1.98% Cycle count: 282183352210 -> 282100051824 (-0.03%); split: -0.62%, +0.59% Spill count: 63147 -> 64218 (+1.70%); split: -7.12%, +8.82% Fill count: 184931 -> 175591 (-5.05%); split: -10.81%, +5.76% Scratch Memory Size: 5318656 -> 5970944 (+12.26%); split: -5.91%, +18.17% Max live registers: 918240 -> 906604 (-1.27%) Strange Brigade: Totals from 3675 (92.24% of 3984) affected shaders: Instrs: 1462231 -> 1429345 (-2.25%); split: -2.25%, +0.00% Cycle count: 37404050 -> 37345292 (-0.16%); split: -1.25%, +1.09% Max live registers: 361849 -> 351265 (-2.92%) Witcher 3: Totals from 13 (46.43% of 28) affected shaders: Instrs: 593 -> 660 (+11.30%) Cycle count: 28302 -> 28714 (+1.46%) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Caio Oliveira 
<caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28199>	2025-01-11 08:41:42 +00:00
Kenneth Graunke	4ab04799ee	brw: Delete assign_constant_locations and push_constant_loc[] The push_constant_loc[] array is always an identity mapping these days, so it's kind of pointless. Just use the original uniform number and skip the unnecessary "remap" step. With that gone, and shrinking UBO ranges gone, assign_constant_locations() is now empty and can be removed as well. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32841>	2025-01-06 12:45:47 +00:00
Kenneth Graunke	93e186e1a4	brw: Delete pull constant lowering Now that we never shrink ranges in the backend, we never lower push constants to pull constants late in the backend either. get_pull_loc will never return true, and so all of brw_lower_constant_loads becomes a noop. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32841>	2025-01-06 12:45:47 +00:00
Caio Oliveira	e1aebf8a0c	intel/brw: Remove 'fs' prefix from passes and related functions Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32813>	2025-01-02 18:11:05 +00:00
Caio Oliveira	25384dccc0	intel/brw: Remove 'fs' prefix from passes filenames Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32813>	2025-01-02 18:11:05 +00:00

27 commits