fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-23 19:28:11 +02:00

Author	SHA1	Message	Date
Francisco Jerez	6eea9659db	intel/brw/xe3+: Model trade-off between parallelism and GRF use in performance analysis. This extends the performance analysis pass used in previous generations to make it more useful to deal with the performance trade-off encountered on xe3 hardware as a result of VRT. VRT allows the driver to request a per-thread GRF allocation different from the 128 GRFs that were typical in previous platforms, but this comes at either a thread parallelism cost or benefit depending on the number of GRF register blocks requested. This makes a number of decisions more difficult for the compiler since certain optimizations potentially trade off run-time in a thread against the total number of threads that can run in parallel (e.g. consider scheduling and how reordering an instruction to avoid a stall can increase GRF use and therefore reduce thread-level parallelism when trying to improve instruction-level parallelism). This patch provides a simple heuristic tool to account for the combined interaction of register pressure and other single-threaded factors that affect performance. This is expressed with the redefinition of the pre-existing brw_performance::throughput estimate as the number of invocations per cycle per EU that would be achieved if there were enough threads to reach full load (in this sense this is to be considered a heuristic since the penalty from VRT may be lower than expected from this model at low EU load). This will be used e.g. in order to decide whether to use a more aggressive latency-minimizing mode during scheduling or a mode more effective at minimizing register pressure (it makes sense to take the path that will lead to the most invocations being serviced per cycle while under load). This also allows us to re-enable the old PS SIMD32 heuristic on xe3+, and due to this change it is able to identify cases where the combined effect of poorer scheduling and higher GRF use of the SIMD32 variant makes it more favorable to use SIMD16 only (see last patch of the MR for details and numbers). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36618>	2025-09-10 02:15:56 +00:00
Kenneth Graunke	b848fa4595	brw: Rename is_send_from_grf to is_send, replace other is_send() helper The is_send() helper is just a wrapper around inst->is_send_from_grf() now, so we can combine the two. Trim the name from is_send_from_grf() to is_send(), as it's shorter, and also matches is_math(). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34040>	2025-08-08 22:12:05 +00:00
Ian Romanick	fa74c31b22	brw: Allow additional flags registers on Xe2+ Xe2 adds two more flags registers. We barely use the second flags register on previous platforms, so the omission was not previously noticed. Several efforts in progress will make use of more flags registers. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35415>	2025-07-24 23:08:08 +00:00
Sagar Ghuge	bea9d79cb9	intel/compiler: Add support for MSAA typed load/store messages Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32690>	2025-03-07 23:06:14 +00:00
Kenneth Graunke	88309a9818	brw: Rename shared function enums for clarity Our name for this enum was brw_message_target, but it's better known as shared function ID or SFID. Call it brw_sfid to make it easier to find. Now that brw only supports Gfx9+, we don't particularly care whether SFIDs were introduced on Gfx4, Gfx6, or Gfx7.5. Also, the LSC SFIDs were confusingly tagged "GFX12" but aren't available on Gfx12.0; they were introduced with Alchemist/Meteorlake. GFX6_SFID_DATAPORT_SAMPLER_CACHE in particular was confusing. It sounds like the SFID to use for the sampler on Gfx6+, however it has nothing to do with the sampler at all. BRW_SFID_SAMPLER remains the sampler SFID. On Haswell, we ran out of messages on the main data cache data port, and so they introduced two additional ones, for more messages. The modern Tigerlake PRMs simply call these DP_DC0, DP_DC1, and DP_DC2. I think the "sampler" name came from some idea about reorganizing messages that never materialized (instead, the LSC came as a much larger cleanup). Recently we've adopted the term "HDC" for the legacy data cluster, as opposed to "LSC" for the modern Load/Store Cache. To make clear which SFIDs target the legacy HDC dataports, we use BRW_SFID_HDC0/1/2. We were also citing the G45, Sandybridge, and Ivybridge PRMs for a compiler that supports none of those platforms. Cite modern docs. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33650>	2025-02-27 08:49:24 +00:00
Caio Oliveira	d2c39b1779	intel/brw: Always have a (non-DO) block after a DO in the CFG Make the "block after DO" more stable so that adding instructions after a DO doesn't require repairing the CFG. Use a new SHADER_OPCODE_FLOW instruction that is a placeholder representing "go to the next block" and disappears at code generation. For some context, there are a few facts about how the CFG currently works: - Blocks are assumed to not be empty; - DO is always by itself in a block, i.e. starts and ends a block; - There are no empty blocks; - Predicated WHILE and CONTINUE will link to the "block after DO"; - When nesting loops, it is possible that the "block after DO" is another "DO". Reasons and further explanations for those are in the brw_cfg.c comments. What makes this new change useful is that a pass might want to add instructions between two DO instructions. When that happens, a new block must be created and any predicated WHILE and CONTINUE must be repaired. So, instead of requiring a repair (which has proven to be tricky in the past), this change adds a block that can be "virtually" empty but allows instructions to be added without further changes. One alternative design would be allowing empty blocks, but that would be a deeper change since the blocks are currently assumed to be not empty in various places. We'll save that for when other changes are made to the CFG. The problem described happens in brw_opt_combine_constants, and a different patch will clean that up. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33536>	2025-02-24 23:25:06 +00:00
Caio Oliveira	cf3bb77224	intel/brw: Rename fs_visitor to brw_shader Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32536>	2025-02-11 09:13:28 +00:00
Caio Oliveira	352a63122f	intel/brw: Rename files brw_fs.cpp/h to brw_shader.cpp/h Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32536>	2025-02-11 09:13:28 +00:00
Kenneth Graunke	ae60338142	brw: Lower MEMORY_FENCE and INTERLOCK in lower_logical_sends We teach lower_logical_sends to lower these to SHADER_OPCODE_SEND and drop all the corresponding generator and eu_emit code. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33297>	2025-02-08 01:07:22 +00:00
Caio Oliveira	1ade9a05d8	intel/brw: Use brw prefix instead of namespace for analysis implementations Also drop the 'fs' prefix when applicable. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33048>	2025-02-05 21:47:07 +00:00
Caio Oliveira	0ebb75743d	intel/brw: Use brw_analysis prefix for performance analysis files Move declaration to the common header and rename definition file. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33048>	2025-02-05 21:47:06 +00:00

11 commits