fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-21 02:28:07 +02:00

Author	SHA1	Message	Date
Calder Young	1f1de7ebd6	anv,brw: Allow multiple ray queries without spilling to a shadow stack Allows a shader to have multiple ray queries without spilling them to a shadow stack. Instead, the driver provides the shader with an array of multiple RTDispatchGlobals structs to give each query its own dedicated stack. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38778>	2026-01-16 09:21:50 +00:00
Caio Oliveira	b542ac4ca0	brw: Fix and properly use increment_a64_address() Since the move to MEMORY__LOGICAL the result value was being ignored, so change to use that. Since the conversion to use new registers, some issues were introduced: - Even with `has_64bit_int` ADD with 64-bit immediate value is not supported; - `dst_high` was not being filled if there was no overflow; - Only `dst_low` returned. Found when writing some new code involving large block loads. Fixes: `b79e85a93f` ("brw: always use new registers for load address increments") Fixes: `b55f77161d` ("intel/brw: Switch to emitting MEMORY__LOGICAL opcodes") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39282>	2026-01-15 19:47:23 +00:00
Lionel Landwerlin	fd744b0c8a	brw: switch buffer/image size intrinsics lowering to NIR Fossil-db DG2: Totals from 127 (0.01% of 1799288) affected shaders: Instrs: 60593 -> 60508 (-0.14%); split: -0.15%, +0.01% Cycle count: 7099635 -> 7116148 (+0.23%); split: -0.12%, +0.35% Spill count: 468 -> 466 (-0.43%) Fill count: 224 -> 222 (-0.89%) Max live registers: 6418 -> 6424 (+0.09%); split: -0.06%, +0.16% Non SSA regs after NIR: 11228 -> 11220 (-0.07%); split: -0.20%, +0.12% Fossil-db LNL: Totals from 135 (0.01% of 1573226) affected shaders: Instrs: 55173 -> 55143 (-0.05%); split: -0.07%, +0.01% Cycle count: 9178338 -> 9157052 (-0.23%); split: -0.32%, +0.09% Spill count: 454 -> 452 (-0.44%) Fill count: 181 -> 179 (-1.10%) Max live registers: 12915 -> 12919 (+0.03%); split: -0.06%, +0.09% Non SSA regs after NIR: 10860 -> 10852 (-0.07%); split: -0.20%, +0.13% shader-db LNL: total instructions in shared programs: 16911578 -> 16911566 (<.01%) instructions in affected programs: 1602 -> 1590 (-0.75%) helped: 7 HURT: 0 helped stats (abs) min: 1.0 max: 2.0 x̄: 1.71 x̃: 2 helped stats (rel) min: 0.48% max: 1.10% x̄: 0.75% x̃: 0.74% 95% mean confidence interval for instructions value: -2.17 -1.26 95% mean confidence interval for instructions %-change: -0.96% -0.55% Instructions are helped. total loops in shared programs: 5168 -> 5168 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 848964184 -> 848955094 (<.01%) cycles in affected programs: 1528020 -> 1518930 (-0.59%) helped: 9 HURT: 6 helped stats (abs) min: 2.0 max: 8484.0 x̄: 1212.89 x̃: 20 helped stats (rel) min: 0.02% max: 3.23% x̄: 0.57% x̃: 0.11% HURT stats (abs) min: 2.0 max: 1608.0 x̄: 304.33 x̃: 15 HURT stats (rel) min: <.01% max: 0.59% x̄: 0.19% x̃: 0.07% 95% mean confidence interval for cycles value: -1875.18 663.18 95% mean confidence interval for cycles %-change: -0.75% 0.23% Inconclusive result (value mean confidence interval includes 0). 
total spills in shared programs: 3345 -> 3345 (0.00%) spills in affected programs: 0 -> 0 helped: 0 HURT: 0 total fills in shared programs: 1777 -> 1777 (0.00%) fills in affected programs: 0 -> 0 helped: 0 HURT: 0 total sends in shared programs: 869299 -> 869299 (0.00%) sends in affected programs: 0 -> 0 helped: 0 HURT: 0 LOST: 0 GAINED: 0 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39258>	2026-01-14 10:37:32 +00:00
Alyssa Rosenzweig	c339b55f92	brw/nir_lower_fs_load_output: unify texture builders Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39271>	2026-01-14 08:18:15 +00:00
Lionel Landwerlin	0a3f3fd193	brw: drop unused color_outputs_valid key Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39264>	2026-01-12 20:21:48 +00:00
Lionel Landwerlin	c3bd1a1688	brw: handle layer_id only through system value Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39259>	2026-01-12 19:53:36 +00:00
Lionel Landwerlin	081c5bc6a5	brw: fix derivatives on non 32bit floats Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14600 Meh'd-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39226>	2026-01-12 15:18:46 +00:00
Lionel Landwerlin	a97b01801a	brw: enable SIMD32 compute shaders with ray queries Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11020 Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36181>	2026-01-12 12:19:21 +00:00
Lionel Landwerlin	527ae448e5	brw/nir/rt: ensure we can load 2 RT_DISPATCH_GLOBALS Each group of 16 lanes inside a SIMD32 shader will load different globals. In SIMD8/16 shaders, the divergence analysis will turn this load into nir_load_global_constant_uniform_block_intel. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36181>	2026-01-12 12:19:21 +00:00
Lionel Landwerlin	b996b03f21	brw: enable topology opcodes in SIMD32 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36181>	2026-01-12 12:19:21 +00:00
Lionel Landwerlin	286073f6eb	brw: handle lowering of a couple of opcodes Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36181>	2026-01-12 12:19:21 +00:00
Lionel Landwerlin	2fa09500a2	brw: enable ray query spilling in SIMD32 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36181>	2026-01-12 12:19:21 +00:00
Lionel Landwerlin	6d19b898e7	anv/brw: prep work for SIMD32 ray queries Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36181>	2026-01-12 12:19:21 +00:00
Alyssa Rosenzweig	43efc1cc7e	brw: use nir_is_shared_access Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39219>	2026-01-09 20:51:12 +00:00
Caio Oliveira	d160b7726a	brw/scoreboard: Disable nomask workaround for Xe2+ The issue was caused by fused EU feature that is not used in Xe2+ anymore. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36659>	2026-01-09 17:25:00 +00:00
Caio Oliveira	47a6ef3fef	brw/scoreboard: Use a predicate helper for the nomask workaround If it wasn't for the workaround, it wouldn't be necessary to track whether instructions are exec_all or not. The workaround affects results when mixing a dep and inst with different exec_all. Add the predicate so that, when the workaround is disabled, none of the effects of having different exec_all will kick in; all of them will be considered `exec_all = true`. This patch doesn't change any behavior, it just adds the predicate. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36659>	2026-01-09 17:25:00 +00:00
Lionel Landwerlin	faa857a061	intel: rework push constant handling nr_params & params array are gone. brw_ubo_range is not stored on the prog_data structure anymore (Anv already stored a copy of that with its own additional information) The backend now only deals with load_push_data_intel. load_uniform & load_push_constant have to be lowered by the driver. Pre Gfx12.5 platforms have to provide a subgroup_id_param to specify where the subgroup_id value is located in the push constants. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38975>	2026-01-09 14:19:52 +00:00
Lionel Landwerlin	60e359412d	iris: manage TBIMR null push constant wa in driver Anv already manages this itself. This allows removing the logic from the compiler. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38975>	2026-01-09 14:19:52 +00:00
Lionel Landwerlin	f4a0e05970	anv/brw/iris: get rid of param array on prog_data Drivers can do all the lowering to push constants to find the only value useful in that array (subgroup_id). Then drivers call into brw_cs_fill_push_const_info() to get the cross/per thread constant layout computed in the prog_data. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38975>	2026-01-09 14:19:51 +00:00
Lionel Landwerlin	ec456e99f2	brw: add a pass to lower ubo to push constant data Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38975>	2026-01-09 14:19:49 +00:00
Lionel Landwerlin	2c7254c131	brw: invert condition to reduce code nesting Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38975>	2026-01-09 14:19:48 +00:00
Caio Oliveira	dcefa0e6b3	brw: Rework UIP and JIP setting code The current code walks the instructions, and when needed, it will scan to find the next "end of scope" and sometimes the next "end of block". It also has a separate patching logic for HALTs. The new code collects the necessary scope information up front, then walks the instructions backwards, avoiding the need to scan for the end of scope. It will also walk only the relevant instructions that were previously collected. It also replaces the previous HALT-specific patching logic. With this new change, many cases that were jumping to intermediate HALTs will now jump straight to the end of scope (or the "end of the program" section). E.g. in ``` if ... (...) HALT ... (...) HALT endif ``` both HALTs will now jump to the end of the scope, instead of the first HALT jumping into the second one. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38914>	2026-01-08 22:01:45 +00:00
Caio Oliveira	c939744d2d	brw: Consolidate generator code for emitting "regular" instructions Most instructions follow the basic formats (1, 2 and 3 src), so consolidate their emission code in the generator. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38878>	2026-01-08 16:47:02 +00:00
Caio Oliveira	e1e055f23f	brw: Move LRP related validation Move validation, noting that LRP only supports BRW_TYPE_F -- the previous assert had DF because it also was used by MAD in the past. With that change, ALU3F can be replaced by ALU3 for LRP. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38878>	2026-01-08 16:47:02 +00:00
Caio Oliveira	68e1a07181	brw: Move normalization of 3-src instructions swizzles to a single place When repctrl is used, the swizzle/chansel is ignored. Instead of setting a swizzle that has all zeros and encode that, don't encode anything. For context see `e7598c5a62` ("intel/compiler: Set swizzle to BRW_SWIZZLE_XXXX for scalar region"). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38878>	2026-01-08 16:47:01 +00:00
José Roberto de Souza	0cc73385e6	intel/brw: Document UBO_START Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39175>	2026-01-07 14:25:42 +00:00
José Roberto de Souza	961ca451e0	intel/brw: Add comment to ubo_ranges Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39175>	2026-01-07 14:25:42 +00:00
Georg Lehmann	eb4737a1dd	nir: add nir_alu_instr_is_exact helper Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39103>	2026-01-07 09:40:57 +00:00
Marek Olšák	1912a00a91	ALL: use SHA1_DIGEST_LENGTH etc. instead of hardcoding the numbers only build_id is switched to use literal 20 instead of SHA1_DIGEST_LENGTH because we will increase SHA1_DIGEST_LENGTH to BLAKE3_KEY_LEN Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39110>	2026-01-07 08:32:33 +00:00
José Roberto de Souza	6f031a98e0	intel/brw: Nuke brw_inst::is_volatile() There are no users of that function; is_volatile is only used in brw_opt_cse.cpp is_expression(), but it accesses the information using the brw_send_inst struct. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39104>	2026-01-05 14:11:47 +00:00
Georg Lehmann	f3290219ab	nir: use a seperate enum for per alu floating point math control We don't need one bit per bitsize per instruction if only one actually matters in the end. First step towards moving NIR in the direction of full float_controls2 only. Also rename this from fp_fast_math, because that name implied that 0 is the no fast math mode, while the opposite was the case. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39026>	2025-12-29 10:57:05 +00:00
Sushma Venkatesh Reddy	d1d4e3d530	brw: Add EU assembler support for float8 Decode logic in Gfx12+ has become complex with the new types, so Caio suggested that we move to the table like other gens. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39007>	2025-12-19 00:09:53 +00:00
Jordan Justen	0088aae481	intel/brw: Add new encode/decode for use with brw_data_type_float/int Rework: * Sushma: Add BF in brw_data_type_encode, brw_data_type_decode Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39007>	2025-12-19 00:09:53 +00:00
Jordan Justen	46e843f76e	intel/brw: Add brw_data_type_float/brw_data_type_int These type encodings were first used in dpas instructions, but continue to be used in more places. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39007>	2025-12-19 00:09:52 +00:00
Sushma Venkatesh Reddy	54accefed2	brw: Add BRW_TYPE_BF8 and BRW_TYPE_HF8 for float8 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39007>	2025-12-19 00:09:52 +00:00
Ian Romanick	b967942b64	brw: Do cmod prop again after scheduling After selecting the scheduling mode, do cmod prop again. It's possible that doing cmod prop between performing a schedule and trying to register allocate would cause a different scheduling mode to be selected. However, this would require fully restoring the pre-schedule set of instructions (via cloning). I have tried to implement this, and it's harder than it looks. :( v2: Delete unused variable `progress`. Noticed by Marge. shader-db: All Intel platforms had similar results. (Meteor Lake shown) total instructions in shared programs: 19967018 -> 19967006 (<.01%) instructions in affected programs: 10652 -> 10640 (-0.11%) helped: 4 / HURT: 0 total cycles in shared programs: 884129990 -> 884139590 (<.01%) cycles in affected programs: 20334512 -> 20344112 (0.05%) helped: 0 / HURT: 4 fossil-db: Lunar Lake Totals: Instrs: 924967191 -> 924963460 (-0.00%); split: -0.00%, +0.00% Cycle count: 105962414958 -> 105961925594 (-0.00%); split: -0.00%, +0.00% Spill count: 3423582 -> 3423564 (-0.00%); split: -0.00%, +0.00% Fill count: 4877121 -> 4876955 (-0.00%); split: -0.00%, +0.00% Totals from 2511 (0.12% of 2018786) affected shaders: Instrs: 12541707 -> 12537976 (-0.03%); split: -0.03%, +0.00% Cycle count: 4816359238 -> 4815869874 (-0.01%); split: -0.01%, +0.00% Spill count: 179536 -> 179518 (-0.01%); split: -0.03%, +0.02% Fill count: 279407 -> 279241 (-0.06%); split: -0.07%, +0.01% Meteor Lake, DG2, Tiger Lake, Ice Lake, and Skylake had similar results. 
(Meteor Lake shown) Totals: Instrs: 980252404 -> 980237686 (-0.00%); split: -0.00%, +0.00% Cycle count: 91758669556 -> 91764028404 (+0.01%); split: -0.00%, +0.01% Spill count: 3664771 -> 3664744 (-0.00%); split: -0.00%, +0.00% Fill count: 4962078 -> 4960482 (-0.03%); split: -0.04%, +0.01% Totals from 8472 (0.38% of 2251522) affected shaders: Instrs: 34977623 -> 34962905 (-0.04%); split: -0.04%, +0.00% Cycle count: 6251857553 -> 6257216401 (+0.09%); split: -0.04%, +0.13% Spill count: 480251 -> 480224 (-0.01%); split: -0.01%, +0.00% Fill count: 676539 -> 674943 (-0.24%); split: -0.28%, +0.05% Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>	2025-12-18 15:15:20 -08:00
Ian Romanick	09450faf6a	brw: Do cmod prop again after post-RA scheduling shader-db: All Intel platforms had similar results. (Meteor Lake shown) total instructions in shared programs: 19968728 -> 19963825 (-0.02%) instructions in affected programs: 788014 -> 783111 (-0.62%) helped: 2503 / HURT: 0 total cycles in shared programs: 884112912 -> 884093268 (<.01%) cycles in affected programs: 20017168 -> 19997524 (-0.10%) helped: 1830 / HURT: 52 LOST: 0 GAINED: 6 fossil-db: All Intel platforms had similar results. (Meteor Lake shown) Totals: Instrs: 980768016 -> 980172179 (-0.06%) Cycle count: 91762351767 -> 91757280093 (-0.01%); split: -0.01%, +0.00% Max dispatch width: 37602592 -> 37608768 (+0.02%) Totals from 157150 (6.98% of 2251329) affected shaders: Instrs: 107323207 -> 106727370 (-0.56%) Cycle count: 12696754006 -> 12691682332 (-0.04%); split: -0.04%, +0.00% Max dispatch width: 3708584 -> 3714760 (+0.17%) Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>	2025-12-18 15:15:20 -08:00
Ian Romanick	08d71730ca	brw/cmod: Propagate to an instruction with same source Detect cases like mov.nz.f0.0(8) null<1>D g66<8,8,1>D (+f0.0) sel(8) g123<1>UD g87<8,8,1>UD g84<8,8,1>UD mov.nz.f0.0(8) null<1>D g66<8,8,1>D (+f0.0) sel(8) g124<1>UD g88<8,8,1>UD g85<8,8,1>UD Either MOV instruction could also be an equivalent CMP. v2: Require no predicate, groups match, and flags written match. v3: Add some more unit tests. Suggested by Caio. shader-db: All Intel platforms had similar results. (Lunar Lake shown) total instructions in shared programs: 17203627 -> 17203590 (<.01%) instructions in affected programs: 51432 -> 51395 (-0.07%) helped: 37 / HURT: 0 total cycles in shared programs: 879884982 -> 879884670 (<.01%) cycles in affected programs: 6014730 -> 6014418 (<.01%) helped: 25 / HURT: 4 fossil-db: Lunar Lake Totals: Instrs: 925092938 -> 925071952 (-0.00%); split: -0.00%, +0.00% Cycle count: 105972157149 -> 105966120894 (-0.01%); split: -0.01%, +0.00% Spill count: 3423592 -> 3423582 (-0.00%) Fill count: 4876743 -> 4877121 (+0.01%); split: -0.00%, +0.01% Max live registers: 193525293 -> 193525251 (-0.00%) Max dispatch width: 49047056 -> 49047088 (+0.00%); split: +0.00%, -0.00% Totals from 17714 (0.88% of 2018791) affected shaders: Instrs: 56708169 -> 56687183 (-0.04%); split: -0.04%, +0.00% Cycle count: 4560530879 -> 4554494624 (-0.13%); split: -0.15%, +0.01% Spill count: 434846 -> 434836 (-0.00%) Fill count: 807443 -> 807821 (+0.05%); split: -0.02%, +0.07% Max live registers: 4332542 -> 4332500 (-0.00%) Max dispatch width: 295248 -> 295280 (+0.01%); split: +0.02%, -0.01% Meteor Lake and DG2 had similar results. 
(Meteor Lake shown) Totals: Instrs: 995075628 -> 995051291 (-0.00%); split: -0.00%, +0.00% Cycle count: 92060967154 -> 92059311640 (-0.00%); split: -0.00%, +0.00% Spill count: 3664664 -> 3664675 (+0.00%); split: -0.00%, +0.00% Fill count: 4961929 -> 4961874 (-0.00%); split: -0.00%, +0.00% Max live registers: 121480292 -> 121480184 (-0.00%) Max dispatch width: 37947528 -> 37947496 (-0.00%) Totals from 20569 (0.90% of 2278279) affected shaders: Instrs: 57437989 -> 57413652 (-0.04%); split: -0.04%, +0.00% Cycle count: 4297505238 -> 4295849724 (-0.04%); split: -0.06%, +0.03% Spill count: 487508 -> 487519 (+0.00%); split: -0.00%, +0.00% Fill count: 869228 -> 869173 (-0.01%); split: -0.01%, +0.00% Max live registers: 2413028 -> 2412920 (-0.00%) Max dispatch width: 239280 -> 239248 (-0.01%) Tiger Lake and Ice Lake had similar results. (Tiger Lake shown) Totals: Instrs: 1012570598 -> 1012546137 (-0.00%); split: -0.00%, +0.00% Cycle count: 85579989052 -> 85589116671 (+0.01%); split: -0.00%, +0.01% Spill count: 3901755 -> 3901748 (-0.00%) Fill count: 6799383 -> 6799367 (-0.00%) Max live registers: 122288761 -> 122288658 (-0.00%) Totals from 20595 (0.90% of 2280449) affected shaders: Instrs: 57764192 -> 57739731 (-0.04%); split: -0.04%, +0.00% Cycle count: 3899898675 -> 3909026294 (+0.23%); split: -0.04%, +0.27% Spill count: 481262 -> 481255 (-0.00%) Fill count: 1057996 -> 1057980 (-0.00%) Max live registers: 2412395 -> 2412292 (-0.00%) Skylake Totals: Instrs: 516619178 -> 516617390 (-0.00%) Cycle count: 57593545602 -> 57592502019 (-0.00%); split: -0.00%, +0.00% Fill count: 860403 -> 860402 (-0.00%) Max live registers: 87553761 -> 87553649 (-0.00%) Totals from 1357 (0.08% of 1730068) affected shaders: Instrs: 3575640 -> 3573852 (-0.05%) Cycle count: 1772148559 -> 1771104976 (-0.06%); split: -0.06%, +0.00% Fill count: 68917 -> 68916 (-0.00%) Max live registers: 131237 -> 131125 (-0.09%) Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: 
<https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>	2025-12-18 15:15:20 -08:00
Ian Romanick	50f2cd7366	brw/dce: Don't generate more NULL destinations after brw_lower_3src_null_dest Later commits will call DCE after lowering has been performed. Creating more things that would need lowering is problematic. No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>	2025-12-18 15:15:20 -08:00
Ian Romanick	24cd8aa3b8	brw/cmod: Allow FIXED_GRF Later commits will call cmod prop after register allocation. At that time, there is only FIXED_GRF. No shader-db or fossil-db changes on any Intel platform. v2: FIXED_GRF uses subnr instead of offset. Add a unit test to demonstrate the issue. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> [v1] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>	2025-12-18 15:15:20 -08:00
Ian Romanick	d7227b11a1	brw: elk: Disable can_do_cmod for MACH PRMs for G35 (Gfx4) through Ivy Bridge (Gfx7) all say that conditional modifiers are allowed for MACH. Starting with Haswell (Gfx7.5), this seems to be removed. This function doesn't have any way to know the platform, so false is returned for all platforms. No shader-db or fossil-db changes on any Intel platform. Prevents a failure in "brw: Do cmod prop again after post-RA scheduling" in piglit's builtin-uint-mad_sat-1.0.generated.cl. Cc: stable Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>	2025-12-18 15:15:20 -08:00
Ian Romanick	ba30794847	brw/cmod: Don't propagate between instructions in different groups The group implicitly selects which flags the instruction can write. This was discovered while working on another set of changes that could change some logical operations into predicated MOV instructions. Prevents regressions later in the series in dEQP-VK.graphicsfuzz.cov-loop-fragcoord-identical-condition. No shader-db or fossil-db changes on any Intel platform. v2: Update the comment in the test case. Suggested by Caio. Fixes: `95ac3b1dae` ("i965/fs: don't propagate cmod when the exec sizes differ") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>	2025-12-18 15:15:20 -08:00
Ian Romanick	c0fb93506b	brw: Add brw_reg::is_grf v2: Add a function comment. Suggested by Caio. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>	2025-12-18 15:15:20 -08:00
Alyssa Rosenzweig	61dc9201a1	brw: constant fold before texture lowering This ensures we don't need dynamic stuff. Noticed when debugging weird regressions around the mcs lowering. ARL: total instructions in shared programs: 19857061 -> 19854964 (-0.01%) instructions in affected programs: 91768 -> 89671 (-2.29%) helped: 154 HURT: 0 helped stats (abs) min: 9.0 max: 33.0 x̄: 13.62 x̃: 13 helped stats (rel) min: 0.51% max: 40.91% x̄: 4.66% x̃: 3.36% 95% mean confidence interval for instructions value: -14.04 -13.19 95% mean confidence interval for instructions %-change: -5.49% -3.84% Instructions are helped. total cycles in shared programs: 884538769 -> 884485530 (<.01%) cycles in affected programs: 10508994 -> 10455755 (-0.51%) helped: 116 HURT: 38 helped stats (abs) min: 4.0 max: 15238.0 x̄: 666.22 x̃: 148 helped stats (rel) min: 0.01% max: 34.53% x̄: 2.58% x̃: 1.07% HURT stats (abs) min: 4.0 max: 4027.0 x̄: 632.68 x̃: 302 HURT stats (rel) min: 0.01% max: 32.75% x̄: 3.46% x̃: 0.59% 95% mean confidence interval for cycles value: -631.32 -60.09 95% mean confidence interval for cycles %-change: -2.06% -0.12% Cycles are helped. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39023>	2025-12-18 17:55:29 +00:00
Kenneth Graunke	d83c699045	brw: Convert GS pulled inputs to use URB intrinsics We leave GS pushed inputs using load_per_vertex_input for now - they're relatively simple, and using load_attribute_payload doesn't work well since it's assumed to be convergent (for TES, FS inputs) while GS inputs are divergent. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38990>	2025-12-18 06:39:02 +00:00
Kenneth Graunke	eae3bd19d4	brw: Move GS URB Read Length limiting to brw_nir_lower_gs_inputs() We're going to be deciding on push vs. pull in the NIR lowering pass soon, so move the code to limit our register usage from brw's thread payload code to brw_nir_lower_gs_inputs(). Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38990>	2025-12-18 06:39:02 +00:00
Kenneth Graunke	8889802271	brw: Make max_push_bytes a parameter to URB lowering data This allows us to program something other than a stage-based constant. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38990>	2025-12-18 06:39:02 +00:00
Kenneth Graunke	f62f7d80e2	brw: Update try_load_push_input to handle dword-unit offsets too We don't need this case today, but it's trivial to handle. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38990>	2025-12-18 06:39:01 +00:00
Caio Oliveira	9c16bbd023	brw: Perform mark_last_urb_write_with_eot optimization after CFG Avoid using exec_node::remove() and the initial "main list of instructions", and instead use the existing helpers like other passes. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37146>	2025-12-16 17:02:58 +00:00
Caio Oliveira	e53576a559	brw: Move MATH related validation Moved existing checks to EU validation and added a few more based on instruction description in the various PRMs / BSpec. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38877>	2025-12-16 01:34:46 +00:00

4857 commits