fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-22 13:30:12 +01:00

Author	SHA1	Message	Date
Caio Oliveira	9d53e27579	intel/brw: Remove brw_shader::import_uniforms() The brw_shader::uniforms now is derived from the nir_shader. The only exception is compute shaders for older Gfx versions, so we move the adjust logic for that. The benefit here is untangling the code for compilation variants, that before needed to keep track of the first that compiled to, in most cases, copy an integer. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33541>	2025-08-28 00:06:19 +00:00
Caio Oliveira	b8a35a8a27	brw: Pass per_primitive_offset in brw_shader_params Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33541>	2025-08-28 00:06:19 +00:00
Caio Oliveira	6ca9021758	brw: Add brw_shader_params And unify the initialization code for brw_shader. Avoid passing brw_compile_params since for a single compilation we might have multiple shaders (the case for BS stage). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33541>	2025-08-28 00:06:18 +00:00
Lionel Landwerlin	2281e88381	brw: make assign_curb_setup visible in optimizer debug Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36455>	2025-08-21 09:04:54 +00:00
Lionel Landwerlin	fe38fb858c	brw: workaround broken indirect RT messages on Gfx11 Unfortunately we cannot use the indirect descriptor on Gfx11, it appears to just drop writes. Other platforms appear to be fine. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36883>	2025-08-20 15:01:50 +00:00
Kenneth Graunke	47fe9d28e7	brw: Enumerate SHADER_OPCODE_SEND sources and standardize how many This introduces enums for SHADER_OPCODE_SEND[_GATHER] sources, similar similar to what we've done for most of the newer logical opcodes. This allows us to use actual names for sources rather than remembering their order, or leaving ourselves comments like /* ex_desc */ all over. It will also make it easier to add or reorder sources in the future. While we're at it, we also standardize on the number of sources. Previously, we allowed SHADER_OPCODE_SEND to have either 3 (monosend) or 4 (split send) sources, but this is mostly for haphazard historical reasons. We now specify all sources every time, eliminating the need for careful inst->source checks before accessing the last source. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34040>	2025-08-08 22:12:08 +00:00
Lionel Landwerlin	6d863fda2d	anv/brw: move sample_shading_enable to wm_prog_data The vulkan runtime doesn´t store this parameter in the dynamic state (since it's not a dynamic state). Just capture it at compile time and leave on the wm_prog_data. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36665>	2025-08-08 14:06:58 +00:00
Lionel Landwerlin	f2696b441d	anv/brw: store min_sample_shading on wm_prog_data Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36665>	2025-08-08 14:06:57 +00:00
Alyssa Rosenzweig	3719983edf	brw: replace lower_fs_msaa with nir_inline_sysval Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36516>	2025-08-03 21:27:47 +00:00
Lionel Landwerlin	60932e8fae	brw: always ensure coarse pixel is disabled on Gfx9 No HW support there. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36457>	2025-07-30 07:57:19 +00:00
Lionel Landwerlin	9371e8d370	brw: fixup coarse_z computation The delivered values in the coarse pixel size are 0 when coarse pixel dispatch is disabled and that is screwing up our half pixel offset adjustment. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36457>	2025-07-30 07:57:19 +00:00
Lionel Landwerlin	9dac7dda87	brw: fixup source depth enabling with coarse pixel shading Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36457>	2025-07-30 07:57:18 +00:00
Lionel Landwerlin	fcf4401824	brw: handle wa_18019110168 with independent shader compilation Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>	2025-06-28 05:55:35 +00:00
Lionel Landwerlin	e1a7eb1718	brw: extract out attribute register remapping Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>	2025-06-28 05:55:33 +00:00
Lionel Landwerlin	5cc66e2c8d	anv/brw: move Wa_18019110168 handling to backend We simplify the implementation by assuming the worse case, copying entire per-vertex regions if necessary. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>	2025-06-28 05:55:32 +00:00
Lionel Landwerlin	f0f4f9c566	brw: fix vertex attribute offset computation The formula uses scalar indices (4bytes), not slots (16bytes). We also incorrectly passed a scalar (vertex case) & slot (mesh case) offset in the push constants. Use slots instead so that the value is smaller and we can pack more stuff into fs_msaa_flags. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `18bbcf9a63` ("intel: introduce new VUE layout for separate compiled shader with mesh") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>	2025-06-28 05:55:31 +00:00
Emma Anholt	88f1656133	intel/elk: Save the UW pixel x/y as a temp. This will be used for representing gl_FragCoord in NIR and reducing payload registers pushed. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>	2025-06-18 23:11:38 +00:00
Emma Anholt	af74abd68c	intel/fs: Don't bother checking if load_frag_coord uses interpolation. This was leftover dead code from `4bb6e6817e` ("intel: Use a system value for gl_FragCoord") -- the sysval doesn't do any interpolation and doesn't have sources that could use a barycentric. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>	2025-06-18 23:11:37 +00:00
Caleb Callaway	e7454f5318	intel/debug: shader dump filter Some checks are pending macOS-CI / macOS-CI (dri) (push) Waiting to run Details macOS-CI / macOS-CI (xlib) (push) Waiting to run Details v2: Fixes filtering for various brw shader dump logic Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35061>	2025-05-23 19:57:02 +00:00
Iván Briano	8ee14e5291	brw/anv: add provoking vertex to fs_msaa_flags This will be necessary to select the right value for flat inputs in fragment shaders when fragment shader barycentrics are in use. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34445>	2025-05-20 20:57:58 +00:00
Iván Briano	acdd30a9da	brw: check if the FS needs vertex_attributes_bypass to be set Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34445>	2025-05-20 20:57:58 +00:00
Lionel Landwerlin	5c7c1eceb5	anv/brw: handle pipeline libraries with mesh I always thought there was a massive issue with pipeline libraries & mesh shaders. Indeed recent CTS tests have exposed a number of issues. Some values delivered to the fragment shader are coming from different places depending on whether the preceding shader is Mesh or not. For example PrimitiveID is delivered in the per-primitive block in Mesh pipelines whereas for other pipelines it's coming as a VUE slot (which is per-vertex). Those are 2 different locations in the payload. We have to find a layout for fragment shaders that is compatible with everything. Leaving gaps here and there in the thread payload. Fixes the following test pattern : dEQP-VK.mesh_shader.ext.smoke.fast_lib.shared_* Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Acked-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>	2025-05-08 06:48:35 +00:00
Lionel Landwerlin	18bbcf9a63	intel: introduce new VUE layout for separate compiled shader with mesh Mesh shaders have per vertex block in URB pretty much identical to the VUE format. Let's just reuse that concept to do all of our layout in the payload attribute registers. This will ensure that we have consistent VUE layout between Mesh & non-Mesh pipelines. We need a new way of laying out the VUE though as we have to accomodate a HW constraint of maximum (per-primitive + per-vertex) of 32 varying. This means we cannot have 2 locations in the payload for things like PrimitiveID which can come from either the per-primitive or the per-vertex block. The new layout places the PrimitiveID at the end of the per-vertex attributes and shrinks the delivery dynamically if the mesh stage is active. The shader is compiled with a MOV_INDIRECT to read the PrimitiveID from the right location in the attributes. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>	2025-05-08 06:48:35 +00:00
Lionel Landwerlin	2d396f6085	intel: prepare VUE layout for more than 2 layouts Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>	2025-05-08 06:48:35 +00:00
Lionel Landwerlin	95efdca00b	brw: add documentation pointers to FS attribute layout Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>	2025-05-08 06:48:35 +00:00
Lionel Landwerlin	9d342081e7	brw/nir: add intrinsics to read attribute payload register indirectly Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>	2025-05-08 06:48:35 +00:00
Lionel Landwerlin	62d2e323ba	anv/brw: shrink FS varying payload We're currently allocating payload spots for 3 fields already delivered somewhere else in the payload. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>	2025-05-08 06:48:34 +00:00
Lionel Landwerlin	c467444670	brw/nir: use a new intrinsic for fs_msaa_flag Avoid NIR code doing offset computations. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>	2025-05-08 06:48:34 +00:00
Lionel Landwerlin	cbbe7ff66e	brw: add new helper to print out FS URB setup Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>	2025-05-08 06:48:34 +00:00
Lionel Landwerlin	06ad9a25e5	brw: fix Wa_22013689345 emission Some checks are pending macOS-CI / macOS-CI (dri) (push) Waiting to run Details macOS-CI / macOS-CI (xlib) (push) Waiting to run Details 2 problems : - not detecting null destination correctly - applied too late using SHADER_OPCODE_MEMORY_FENCE, when lowering already happened Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34319>	2025-04-10 16:44:28 +00:00
Caio Oliveira	7ae638c0fe	brw: Add brw_builder::uniform() Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34355>	2025-04-04 23:07:21 +00:00
Connor Abbott	7a55e13939	nir, compiler: Rename needs_quad_helper_invocations This currently treats coarse and fine derivatives the same, but Qualcomm needs to know whether just coarse derivatives are used or fine derivatives/quad ops are also used. Rename this to needs_coarse_quad_helper_invocations make clear the difference from the new field, needs_full_quad_helper_invocations. Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com> Fixes: `264d8a6766` ("ir3: Set need_full_quad depending on info.fs.require_full_quads") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33862>	2025-03-14 21:55:57 +00:00
Kenneth Graunke	cdbedc9eff	intel: Move unlit centroid workaround into the elk compiler Some checks are pending macOS-CI / macOS-CI (dri) (push) Waiting to run Details macOS-CI / macOS-CI (xlib) (push) Waiting to run Details This was only needed on Sandybridge. We can delete the brw code, and replace the generic devinfo bit with a helper inside the elk compiler itself. Thanks to Iván Briano for noticing we still had dead brw code for this. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33764>	2025-03-10 17:23:07 -07:00
Caio Oliveira	8e2a7cb42d	brw: Embed at_end() inside brw_builder(brw_shader *) constructor All remaining uses of that constructor would also use at_end(), and vice-versa. So just implement that behavior in the constructor itself. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33815>	2025-03-06 23:33:38 +00:00
Kenneth Graunke	88309a9818	brw: Rename shared function enums for clarity Our name for this enum was brw_message_target, but it's better known as shared function ID or SFID. Call it brw_sfid to make it easier to find. Now that brw only supports Gfx9+, we don't particularly care whether SFIDs were introduced on Gfx4, Gfx6, or Gfx7.5. Also, the LSC SFIDs were confusingly tagged "GFX12" but aren't available on Gfx12.0; they were introduced with Alchemist/Meteorlake. GFX6_SFID_DATAPORT_SAMPLER_CACHE in particular was confusing. It sounds like the SFID to use for the sampler on Gfx6+, however it has nothing to do with the sampler at all. BRW_SFID_SAMPLER remains the sampler SFID. On Haswell, we ran out of messages on the main data cache data port, and so they introduced two additional ones, for more messages. The modern Tigerlake PRMs simply call these DP_DC0, DP_DC1, and DP_DC2. I think the "sampler" name came from some idea about reorganizing messages that never materialized (instead, the LSC came as a much larger cleanup). Recently we've adopted the term "HDC" for the legacy data cluster, as opposed to "LSC" for the modern Load/Store Cache. To make clear which SFIDs target the legacy HDC dataports, we use BRW_SFID_HDC0/1/2. We were also citing the G45, Sandybridge, and Ivybridge PRMs for a compiler that supports none of those platforms. Cite modern docs. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33650>	2025-02-27 08:49:24 +00:00
Lionel Landwerlin	2f156ddb50	brw: factor out base prog_data setting Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Michael Cheng <michael.cheng@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33643>	2025-02-22 08:30:22 +00:00
Caio Oliveira	cf3bb77224	intel/brw: Rename fs_visitor to brw_shader Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32536>	2025-02-11 09:13:28 +00:00
Caio Oliveira	352a63122f	intel/brw: Rename files brw_fs.cpp/h to brw_shader.cpp/h Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32536>	2025-02-11 09:13:28 +00:00
Caio Oliveira	f8a979466b	intel/brw: Rename and move thread_payload types to own header Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32536>	2025-02-11 09:13:28 +00:00
Kenneth Graunke	accef5e8f5	brw: Replace fs_inst::target field with logical FB read/write sources We can just specify this as a source to the logical FB read/write opcodes. Notably FB reads had no sources before; now they have one. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33297>	2025-02-08 01:07:22 +00:00
Kenneth Graunke	32dd722ff3	brw: Replace fs_inst::last_rt with a logical control source Rather than using a bit in the generic fs_inst data structure, we can simply set a source on our logical FB write messages. (We already do so for many other cases.) In the repclear shader, setting this wasn't actually having an effect, as we were setting it on a SHADER_OPCODE_SEND message which ignored it. (We had already correctly set the bit in the message descriptor.) Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33297>	2025-02-08 01:07:22 +00:00
Kenneth Graunke	fce01b8461	brw: Drop FB_WRITE_LOGICAL_SRC_DST_DEPTH source This was used for legacy depth passthrough on older hardware. Gfx9+ doesn't actually have dst depth as part of the message, which is the only hardware brw supports these days. It sure looks like we were setting it though... Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33297>	2025-02-08 01:07:22 +00:00
Caio Oliveira	ea87bab4ce	intel/brw: Remove 'using namespace brw' directives Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33418>	2025-02-06 07:58:55 -08:00
Caio Oliveira	1ade9a05d8	intel/brw: Use brw prefix instead of namespace for analysis implementations Also drop the 'fs' prefix when applicable. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33048>	2025-02-05 21:47:07 +00:00
Caio Oliveira	e0614e8ea1	intel/brw: Use brw_analysis prefix for liveness analysis files Move declaration to the common header and rename definition file. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33048>	2025-02-05 21:47:06 +00:00
Caio Oliveira	92085e7bab	intel/brw: Remove 'fs' prefix from brw_from_nir functions Reviewed-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Dylan Baker <None> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33330>	2025-02-03 23:08:11 +00:00
Caio Oliveira	d59bd421a2	intel/brw: Rename fs_inst to brw_inst Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33114>	2025-01-31 00:57:21 +00:00
Francisco Jerez	70fecb1483	intel/brw: Report number of GRF registers used in brw_stage_prog_data. This is similar to what we used to do on pre-SNB platforms, the number of GRF registers used by the shader will be used on Xe3+ to adjust the trade-off between thread-level parallelism and size of the GRF file. Plumb the value through prog_data so the driver can set up the hardware state accordingly. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32664>	2025-01-29 23:39:32 +00:00
Sagar Ghuge	7e1362e9c0	intel/brw/xe3+: Don't compile SIMD32 if there is ray queries Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32664>	2025-01-29 23:39:32 +00:00
Francisco Jerez	5b6906076e	intel/brw/xe3+: brw_compile_fs() implementation for Xe3+. This reworks the implementation of brw_compile_fs() to reduce compile time and take advantage of wider dispatch modes more aggressively than the original logic. The new "optimistic" PS compilation logic starts with the SIMD width that is potentially highest performance and only compiles additional narrower variants if that fails (typically due to spilling or hardware restrictions), while the old "pessimistic" logic did the opposite: It started with the narrowest SIMD width and compiled additional variants with increasing register pressure until one of them failed to compile. The main disadvantage of this is that selectively throwing away some of the compiled variants based on the static analysis of their performance behavior will no longer be possible, however this is expected to be less useful on Xe3+ since the GRF space allocated to a thread can be scaled up or down, which leads to less dramatic differences in scheduling between SIMD variants. In typical non-spilling cases where we formerly compiled SIMD16 and SIMD32 variants of the same fragment shader, this change will halve the number of backend compilations required to build a shader. With multi-polygon PS dispatch enabled (which is disabled by default right now) this has an even more dramatic effect since the number of compiler iterations can be reduced down to a fifth in the best case scenario. Even though in most cases we will only attempt to return a single binary from the pixel shader compilation, the hardware allows a pair of PS kernels to be specified, and we'll still take advantage of this when the multi-polygon PS kernel has the potential to have worse performance than the single-polygon shader because only the latter register-allocates successfully at SIMD32 -- Only in such case (SIMD2x8 multi-polygon, SIMD32 single-polygon) we'll continue programming both so the hardware will chose one or the other at runtime depending on the SIMD fullness and number of polygons it can buffer at runtime. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32664>	2025-01-29 23:39:32 +00:00

1 2

75 commits