fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-22 15:40:11 +01:00

Author	SHA1	Message	Date
Kenneth Graunke	1e69ec3b8d	intel/brw: Add a lower_csel pass and allow building it for all types We can do CSEL on F, HF, W, and D on Gfx11+. Gfx9 can only do F. We can lower unsupported types to CMP+CSEL, allowing us to use CSEL in the IR and not worry about the limitations. Rework: (Sagar) - Update validation pass for CSEL Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29316>	2024-07-01 19:06:31 +00:00
Dylan Baker	35298e84f1	intel/compiler: move predicated_break out of backend loop This has no impact on the generated shaders, but does have a small (positive) impact on the amount of time spent in shader compilation. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29126>	2024-06-27 15:20:19 -07:00
Kenneth Graunke	2af84c2d49	intel/brw: Use the defs-based copy propagation along with the old one The new def-based pass works better in many cases, and should be less resource intensive. However, the limited visibility of the defs-based pass due to many values not being SSA yet makes it unable to fully replace the old pass. Try the new one, and if it can't make progress, then try the old one. That way, things will mostly be handled by the new pass, but everything that was being cleaned up still will be. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Kenneth Graunke	8f09c58ddc	intel/brw: Switch to the new defs-based global CSE pass While the limited visibility due to partial SSA is a downside to the new pass, it has a huge number of advantages that make it worth switching over even now. It's much more efficient, can eliminate redundant memory loads across blocks, and doesn't generate loads of unnecessary copies that other passes have to clean up. This means we also eliminate the infighting between the old CSE, coalescing, and copy propagation passes. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Kenneth Graunke	84219892ad	intel/brw: Make gl_SubgroupInvocation lane index loading SSA Our code to initialize gl_SubgroupInvocation uses multiple instructions some of which are partial writes. This makes it difficult to analyze expressions involving gl_SubgroupInvocation, which appear very frequently in compute shaders. To make this easier, we add a new virtual opcode which initializes a full VGRF to the value of gl_SubgroupInvocation. (We also expand it to UD for SIMD8 so there are not partial write issues.) We then lower it to the original code later on in compilation, after we've done the bulk of our optimizations. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Kenneth Graunke	545bb8fb6f	intel/brw: Replace type_sz and brw_reg_type_to_size with brw_type_size_* Both of these helpers do the same thing. We now have brw_type_size_bits and brw_type_size_bytes and can use whichever makes sense in that place. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>	2024-04-25 11:41:48 +00:00
Caio Oliveira	13093ceb3c	intel/brw: Move validate out of fs_visitor Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28534>	2024-04-22 13:38:41 -07:00
Caio Oliveira	671d216f39	intel/brw: Remove two duplicated validate calls in optimizer The OPT macro will call validate() after each pass, so both cases removed by this patch are just redundant calls. Will only affect Debug builds since in Release builds validation is a no-op. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28534>	2024-04-22 13:38:41 -07:00
Kenneth Graunke	ba11127944	intel/brw: Fix opt_split_sends() to allow for FIXED_GRF send sources opt_copy_propagation() can sometimes propagate FIXED_GRF sources into SHADER_OPCODE_SENDs as the message payload. For example, GS input reads, which simply take a URB handle and have the offset in the descriptor. For non-VGRFs, there isn't a payload to split, so just skip past such send messages. Fixes: 589b03d02f ("intel/fs: Opportunistically split SEND message payloads") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28067>	2024-03-27 04:52:17 +00:00
Caio Oliveira	b2ee98d2db	intel/brw: Handle Xe2 in brw_fs_opt_zero_samples The mlen tracking is in REG_SIZE units, but in Xe2 each GRF has doubled the size. The optimization can only elide full GRFs, so round down the amount of trailing zeros to ensure the optimization will remove only full GRFs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28279>	2024-03-21 22:38:54 +00:00
Kenneth Graunke	a075b44493	intel/brw: Eliminate top-level FIND_LIVE_CHANNEL & BROADCAST once brw_fs_opt_eliminate_find_live_channel eliminates FIND_LIVE_CHANNEL outside of control flow. None of our optimization passes generate additional cases of that instruction, so once it's gone, we shouldn't ever have to run the pass again. Moving it out of the loop should save a bit of CPU time. While we're at it, also clean adjacent BROADCAST instructions that consume the result of our FIND_LIVE_CHANNEL. Without this, we have to perform copy propagation to get the MOV 0 immediate into the BROADCAST, then algebraic to turn it into a MOV, which enables more copy propagation...not to mention CSE gets involved. Since this FIND_LIVE_CHANNEL + BROADCAST pattern from emit_uniformize() is really common, and it's trivial to clean up, we can do that. This lets the initial copy prop in the loop see MOV instead of BROADCAST. Zero impact on fossil-db, but less work in the optimization loop. Together with the previous patches, this cuts compile time in Borderlands 3 on Alchemist by -1.38539% +/- 0.1632% (n = 24). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28286>	2024-03-20 01:04:22 -07:00
Kenneth Graunke	ea423aba1b	intel/brw: Split out 64-bit lowering from algebraic optimizations We don't necessarily want to split up MOVs for 64-bit addresses into 2x 32-bit MOVs right away, as this makes things like copy propagating the whole address around harder. We should do this late, once, while still doing other algebraic optimizations earlier. fossil-db results for Alchemist show tiny improvements: Totals: Instrs: 161310502 -> 161310436 (-0.00%); split: -0.00%, +0.00% Cycles: 14370605606 -> 14370605159 (-0.00%); split: -0.00%, +0.00% Totals from 33 (0.01% of 652298) affected shaders: Instrs: 15053 -> 14987 (-0.44%); split: -0.64%, +0.20% Cycles: 196947 -> 196500 (-0.23%); split: -0.25%, +0.02% Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28286>	2024-03-20 01:04:17 -07:00
Kenneth Graunke	bb191e3af5	intel/brw: Call constant combining after copy propagation/algebraic This copy propagation can create MADs with immediates in src1, which need to be cleaned up by constant combining (which puts them back in VGRFs). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27876>	2024-03-05 11:39:26 +00:00
Caio Oliveira	d9552fccf2	intel/brw: Remove extra stage_prog_data field in fs_visitor Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27861>	2024-02-29 19:28:06 +00:00
Caio Oliveira	559d94cd0d	intel/brw: Use fs_visitor instead of backend_shader in various passes And since we are touching them, rename a couple of passes to follow same name convention as existing ones. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27861>	2024-02-29 19:28:05 +00:00
Caio Oliveira	7ac5696157	intel/brw: Remove Gfx8- code from backend passes Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>	2024-02-28 05:45:38 +00:00
Caio Oliveira	f3b7f4726a	intel/brw: Move optimize and small optimizations to brw_fs_opt.cpp Remaining optimizations in brw_fs.cpp will get their own files. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26887>	2024-02-26 20:54:25 +00:00

17 commits
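
Note on b2ee98d2db (Handle Xe2 in brw_fs_opt_zero_samples): the rounding described in that message can be modeled in a few lines. This is a Python sketch of the arithmetic only, not the Mesa C++ code. The function name and the reg_unit parameter are hypothetical names chosen for illustration, and the values assumed for reg_unit (1 where a GRF is one REG_SIZE unit, 2 on Xe2 where the GRF size is doubled) are an assumption based on the commit message.

```python
def elidable_zero_units(trailing_zero_units: int, reg_unit: int) -> int:
    """Return how many trailing all-zero REG_SIZE units can be dropped.

    Only whole GRFs can be removed, so the count is rounded down to a
    multiple of reg_unit (REG_SIZE units per physical GRF; assumed to be
    1 before Xe2 and 2 on Xe2).
    """
    if reg_unit <= 0:
        raise ValueError("reg_unit must be positive")
    return (trailing_zero_units // reg_unit) * reg_unit


# Three trailing zero units on Xe2 cover one full GRF plus half of another,
# so only two units (one full GRF) may be elided.
print(elidable_zero_units(3, 2))
```

With reg_unit equal to 1 the function returns its input unchanged, which matches the behavior before the doubled GRF size had to be considered.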
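
Note on 2af84c2d49 (Use the defs-based copy propagation along with the old one): the "try the new one, and if it can't make progress, then try the old one" policy can be sketched as below. This is a Python model of the control flow only, not the Mesa C++ code. The names copy_propagate, defs_based and legacy are hypothetical, and each pass is assumed to be a callable that returns True when it changed the shader.

```python
from typing import Callable, List

# A pass takes the shader (modeled here as a list of instruction strings)
# and reports whether it made progress. This signature is an assumption.
Pass = Callable[[List[str]], bool]


def copy_propagate(shader: List[str], defs_based: Pass, legacy: Pass) -> bool:
    """Run the defs-based pass first; run the legacy pass only as a fallback."""
    if defs_based(shader):
        # The new pass made progress, so the legacy pass is skipped this round.
        return True
    # The new pass could not see anything to do (for example, values that
    # are not SSA yet), so give the legacy pass a chance.
    return legacy(shader)
```

Under this policy most work goes through the defs-based pass, while anything it cannot see is still cleaned up by the legacy pass on a round where the new pass reports no progress.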