fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-20 11:18:11 +02:00

Author	SHA1	Message	Date
Lionel Landwerlin	ec456e99f2	brw: add a pass to lower ubo to push constant data Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38975>	2026-01-09 14:19:49 +00:00
Lionel Landwerlin	2c7254c131	brw: invert condition to reduce code nesting Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38975>	2026-01-09 14:19:48 +00:00
Caio Oliveira	dcefa0e6b3	brw: Rework UIP and JIP setting code The current code walks the instructions, and when needed, it will scan to find the next "end of scope" and sometimes the next "end of block". It also has a separate patching logic for HALTs. The new code collects the necessary scope information up front, then walks the instruction backwards, making avoiding the need to scan for the end of scope. It will also walk only the relevant instructions that were previously collected. It also replaces the previous HALT-specific patching logic. With this new change, many cases that were jumping to intermediate HALTs, will now jump straight to the end of scope (or the "end of the program" section). E.g. in ``` if ... (...) HALT ... (...) HALT endif ``` both HALTs now will jump to the end of the scope, instead of the first HALT jumping into the second one. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38914>	2026-01-08 22:01:45 +00:00
Caio Oliveira	c939744d2d	brw: Consolidate generator code for emitting "regular" instructions Most of instructions follow the basic formats (1, 2 and 3 src), so consolidate their emission code in generator. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38878>	2026-01-08 16:47:02 +00:00
Caio Oliveira	e1e055f23f	brw: Move LRP related validation Move validation, noting that LRP only supports BRW_TYPE_F -- the previous assert had DF because it also was used by MAD in the past. With that change, ALU3F can be replaced by ALU3 for LRP. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38878>	2026-01-08 16:47:02 +00:00
Caio Oliveira	68e1a07181	brw: Move normalization of 3-src instructions swizzles to a single place When repctrl is used, the swizzle/chansel is ignored. Instead of setting a swizzle that has all zeros and encode that, don't encode anything. For context see `e7598c5a62` ("intel/compiler: Set swizzle to BRW_SWIZZLE_XXXX for scalar region"). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38878>	2026-01-08 16:47:01 +00:00
José Roberto de Souza	0cc73385e6	intel/brw: Document UBO_START Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39175>	2026-01-07 14:25:42 +00:00
José Roberto de Souza	961ca451e0	intel/brw: Add comment to ubo_ranges Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39175>	2026-01-07 14:25:42 +00:00
Georg Lehmann	eb4737a1dd	nir: add nir_alu_instr_is_exact helper Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39103>	2026-01-07 09:40:57 +00:00
Marek Olšák	1912a00a91	ALL: use SHA1_DIGEST_LENGTH etc. instead of hardcoding the numbers only build_id is switched to use literal 20 instead of SHA1_DIGEST_LENGTH because we will increase SHA1_DIGEST_LENGTH to BLAKE3_KEY_LEN Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39110>	2026-01-07 08:32:33 +00:00
José Roberto de Souza	6f031a98e0	intel/brw: Nuke brw_inst::is_volatile() There is no users for that function, is_volatile is only used in brw_opt_cse.cpp is_expression() but it access the information using brw_send_inst struct. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39104>	2026-01-05 14:11:47 +00:00
Georg Lehmann	f3290219ab	nir: use a seperate enum for per alu floating point math control We don't need one bit per bitsize per instruction if only one actually matters in the end. First step towards moving NIR in the direction of full float_controls2 only. Also rename this from fp_fast_math, because that name implied that 0 is the no fast math mode, while the opposite was the case. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39026>	2025-12-29 10:57:05 +00:00
Sushma Venkatesh Reddy	d1d4e3d530	brw: Add EU assembler support for float8 Decode logic in Gfx12+ has become complex with the new types, so Caio suggested that we move to the table like other gens. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39007>	2025-12-19 00:09:53 +00:00
Jordan Justen	0088aae481	intel/brw: Add new encode/decode for use with brw_data_type_float/int Rework: * Sushma: Add BF in brw_data_type_encode, brw_data_type_decode Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39007>	2025-12-19 00:09:53 +00:00
Jordan Justen	46e843f76e	intel/brw: Add brw_data_type_float/brw_data_type_int These type encodings were first were used in dpas instructions, but continue to be used in more places. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39007>	2025-12-19 00:09:52 +00:00
Sushma Venkatesh Reddy	54accefed2	brw: Add BRW_TYPE_BF8 and BRW_TYPE_HF8 for float8 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39007>	2025-12-19 00:09:52 +00:00
Ian Romanick	b967942b64	brw: Do cmod prop again after scheduling After selecting the scheduling mode, do cmod prop again. It's possible that doing cmod prop between performing a schedule and trying to register allocate would cause a different scheduling mode to be selected. However, this would require fully restoring the pre-schedule set of instructions (via cloning). I have tried to implement this, and it's harder than it looks. :( v2: Delete unused variable `progress`. Noticed by Marge. shader-db: All Intel platforms had similar results. (Meteor Lake shown) total instructions in shared programs: 19967018 -> 19967006 (<.01%) instructions in affected programs: 10652 -> 10640 (-0.11%) helped: 4 / HURT: 0 total cycles in shared programs: 884129990 -> 884139590 (<.01%) cycles in affected programs: 20334512 -> 20344112 (0.05%) helped: 0 / HURT: 4 fossil-db: Lunar Lake Totals: Instrs: 924967191 -> 924963460 (-0.00%); split: -0.00%, +0.00% Cycle count: 105962414958 -> 105961925594 (-0.00%); split: -0.00%, +0.00% Spill count: 3423582 -> 3423564 (-0.00%); split: -0.00%, +0.00% Fill count: 4877121 -> 4876955 (-0.00%); split: -0.00%, +0.00% Totals from 2511 (0.12% of 2018786) affected shaders: Instrs: 12541707 -> 12537976 (-0.03%); split: -0.03%, +0.00% Cycle count: 4816359238 -> 4815869874 (-0.01%); split: -0.01%, +0.00% Spill count: 179536 -> 179518 (-0.01%); split: -0.03%, +0.02% Fill count: 279407 -> 279241 (-0.06%); split: -0.07%, +0.01% Meteor Lake, DG2, Tiger Lake, Ice Lake, and Skylake had similar results. 
(Meteor Lake shown) Totals: Instrs: 980252404 -> 980237686 (-0.00%); split: -0.00%, +0.00% Cycle count: 91758669556 -> 91764028404 (+0.01%); split: -0.00%, +0.01% Spill count: 3664771 -> 3664744 (-0.00%); split: -0.00%, +0.00% Fill count: 4962078 -> 4960482 (-0.03%); split: -0.04%, +0.01% Totals from 8472 (0.38% of 2251522) affected shaders: Instrs: 34977623 -> 34962905 (-0.04%); split: -0.04%, +0.00% Cycle count: 6251857553 -> 6257216401 (+0.09%); split: -0.04%, +0.13% Spill count: 480251 -> 480224 (-0.01%); split: -0.01%, +0.00% Fill count: 676539 -> 674943 (-0.24%); split: -0.28%, +0.05% Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>	2025-12-18 15:15:20 -08:00
Ian Romanick	09450faf6a	brw: Do cmod prop again after post-RA scheduling shader-db: All Intel platforms had similar results. (Meteor Lake shown) total instructions in shared programs: 19968728 -> 19963825 (-0.02%) instructions in affected programs: 788014 -> 783111 (-0.62%) helped: 2503 / HURT: 0 total cycles in shared programs: 884112912 -> 884093268 (<.01%) cycles in affected programs: 20017168 -> 19997524 (-0.10%) helped: 1830 / HURT: 52 LOST: 0 GAINED: 6 fossil-db: All Intel platforms had similar results. (Meteor Lake shown) Totals: Instrs: 980768016 -> 980172179 (-0.06%) Cycle count: 91762351767 -> 91757280093 (-0.01%); split: -0.01%, +0.00% Max dispatch width: 37602592 -> 37608768 (+0.02%) Totals from 157150 (6.98% of 2251329) affected shaders: Instrs: 107323207 -> 106727370 (-0.56%) Cycle count: 12696754006 -> 12691682332 (-0.04%); split: -0.04%, +0.00% Max dispatch width: 3708584 -> 3714760 (+0.17%) Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>	2025-12-18 15:15:20 -08:00
Ian Romanick	08d71730ca	brw/cmod: Propagate to an instruction with same source Detect cases like mov.nz.f0.0(8) null<1>D g66<8,8,1>D (+f0.0) sel(8) g123<1>UD g87<8,8,1>UD g84<8,8,1>UD mov.nz.f0.0(8) null<1>D g66<8,8,1>D (+f0.0) sel(8) g124<1>UD g88<8,8,1>UD g85<8,8,1>UD Either MOV instruction could also be an equivalent CMP. v2: Require no predicate, groups match, and flags written match. v3: Add some more unit tests. Suggested by Caio. shader-db: All Intel platforms had similar results. (Lunar Lake shown) total instructions in shared programs: 17203627 -> 17203590 (<.01%) instructions in affected programs: 51432 -> 51395 (-0.07%) helped: 37 / HURT: 0 total cycles in shared programs: 879884982 -> 879884670 (<.01%) cycles in affected programs: 6014730 -> 6014418 (<.01%) helped: 25 / HURT: 4 fossil-db: Lunar Lake Totals: Instrs: 925092938 -> 925071952 (-0.00%); split: -0.00%, +0.00% Cycle count: 105972157149 -> 105966120894 (-0.01%); split: -0.01%, +0.00% Spill count: 3423592 -> 3423582 (-0.00%) Fill count: 4876743 -> 4877121 (+0.01%); split: -0.00%, +0.01% Max live registers: 193525293 -> 193525251 (-0.00%) Max dispatch width: 49047056 -> 49047088 (+0.00%); split: +0.00%, -0.00% Totals from 17714 (0.88% of 2018791) affected shaders: Instrs: 56708169 -> 56687183 (-0.04%); split: -0.04%, +0.00% Cycle count: 4560530879 -> 4554494624 (-0.13%); split: -0.15%, +0.01% Spill count: 434846 -> 434836 (-0.00%) Fill count: 807443 -> 807821 (+0.05%); split: -0.02%, +0.07% Max live registers: 4332542 -> 4332500 (-0.00%) Max dispatch width: 295248 -> 295280 (+0.01%); split: +0.02%, -0.01% Meteor Lake and DG2 had similar results. 
(Meteor Lake shown) Totals: Instrs: 995075628 -> 995051291 (-0.00%); split: -0.00%, +0.00% Cycle count: 92060967154 -> 92059311640 (-0.00%); split: -0.00%, +0.00% Spill count: 3664664 -> 3664675 (+0.00%); split: -0.00%, +0.00% Fill count: 4961929 -> 4961874 (-0.00%); split: -0.00%, +0.00% Max live registers: 121480292 -> 121480184 (-0.00%) Max dispatch width: 37947528 -> 37947496 (-0.00%) Totals from 20569 (0.90% of 2278279) affected shaders: Instrs: 57437989 -> 57413652 (-0.04%); split: -0.04%, +0.00% Cycle count: 4297505238 -> 4295849724 (-0.04%); split: -0.06%, +0.03% Spill count: 487508 -> 487519 (+0.00%); split: -0.00%, +0.00% Fill count: 869228 -> 869173 (-0.01%); split: -0.01%, +0.00% Max live registers: 2413028 -> 2412920 (-0.00%) Max dispatch width: 239280 -> 239248 (-0.01%) Tiger Lake and Ice Lake had similar results. (Tiger Lake shown) Totals: Instrs: 1012570598 -> 1012546137 (-0.00%); split: -0.00%, +0.00% Cycle count: 85579989052 -> 85589116671 (+0.01%); split: -0.00%, +0.01% Spill count: 3901755 -> 3901748 (-0.00%) Fill count: 6799383 -> 6799367 (-0.00%) Max live registers: 122288761 -> 122288658 (-0.00%) Totals from 20595 (0.90% of 2280449) affected shaders: Instrs: 57764192 -> 57739731 (-0.04%); split: -0.04%, +0.00% Cycle count: 3899898675 -> 3909026294 (+0.23%); split: -0.04%, +0.27% Spill count: 481262 -> 481255 (-0.00%) Fill count: 1057996 -> 1057980 (-0.00%) Max live registers: 2412395 -> 2412292 (-0.00%) Skylake Totals: Instrs: 516619178 -> 516617390 (-0.00%) Cycle count: 57593545602 -> 57592502019 (-0.00%); split: -0.00%, +0.00% Fill count: 860403 -> 860402 (-0.00%) Max live registers: 87553761 -> 87553649 (-0.00%) Totals from 1357 (0.08% of 1730068) affected shaders: Instrs: 3575640 -> 3573852 (-0.05%) Cycle count: 1772148559 -> 1771104976 (-0.06%); split: -0.06%, +0.00% Fill count: 68917 -> 68916 (-0.00%) Max live registers: 131237 -> 131125 (-0.09%) Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: 
<https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>	2025-12-18 15:15:20 -08:00
Ian Romanick	50f2cd7366	brw/dce: Don't generate more NULL destinations after brw_lower_3src_null_dest Later commits will call DCE after lowering has been performed. Creating more things that would need lowering is problematic. No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>	2025-12-18 15:15:20 -08:00
Ian Romanick	24cd8aa3b8	brw/cmod: Allow FIXED_GRF Later commits will call cmod prop after register allocation. At that time, there is only FIXED_GRF. No shader-db or fossil-db changes on any Intel platform. v2: FIXED_GRF uses subnr instead of offset. Add a unit test to demonstrate the issue. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> [v1] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>	2025-12-18 15:15:20 -08:00
Ian Romanick	d7227b11a1	brw: elk: Disable can_do_cmod for MACH PRMs for G35 (Gfx4) through Ivy Bridge (Gfx7) all say that conditional modifiers are allowed for MACH. Starting with Haswell (Gfx7.5), this seems to be removed. This function doesn't have any way to know the platform, so false is returned for all platforms. No shader-db or fossil-db changes on any Intel platform. Prevents a failure in "brw: Do cmod prop again after post-RA scheduling" in piglit's builtin-uint-mad_sat-1.0.generated.cl. Cc: stable Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>	2025-12-18 15:15:20 -08:00
Ian Romanick	ba30794847	brw/cmod: Don't propagate between instructions in different groups The group implicity selects which flags the instruction can write. This was discovered while working on another set of changes that could change some logical operations into predicated MOV instructions. Prevents regressions later in the series in dEQP-VK.graphicsfuzz.cov-loop-fragcoord-identical-condition. No shader-db or fossil-db changes on any Intel platform. v2: Update the comment in the test case. Suggested by Caio. Fixes: `95ac3b1dae` ("i965/fs: don't propagate cmod when the exec sizes differ") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>	2025-12-18 15:15:20 -08:00
Ian Romanick	c0fb93506b	brw: Add brw_reg::is_grf v2: Add a function comment. Suggested by Caio. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>	2025-12-18 15:15:20 -08:00
Alyssa Rosenzweig	61dc9201a1	brw: constant fold before texture lowering This ensures we don't need dynamic stuff. Noticed when debugging weird regressions around the mcs lowering. ARL: total instructions in shared programs: 19857061 -> 19854964 (-0.01%) instructions in affected programs: 91768 -> 89671 (-2.29%) helped: 154 HURT: 0 helped stats (abs) min: 9.0 max: 33.0 x̄: 13.62 x̃: 13 helped stats (rel) min: 0.51% max: 40.91% x̄: 4.66% x̃: 3.36% 95% mean confidence interval for instructions value: -14.04 -13.19 95% mean confidence interval for instructions %-change: -5.49% -3.84% Instructions are helped. total cycles in shared programs: 884538769 -> 884485530 (<.01%) cycles in affected programs: 10508994 -> 10455755 (-0.51%) helped: 116 HURT: 38 helped stats (abs) min: 4.0 max: 15238.0 x̄: 666.22 x̃: 148 helped stats (rel) min: 0.01% max: 34.53% x̄: 2.58% x̃: 1.07% HURT stats (abs) min: 4.0 max: 4027.0 x̄: 632.68 x̃: 302 HURT stats (rel) min: 0.01% max: 32.75% x̄: 3.46% x̃: 0.59% 95% mean confidence interval for cycles value: -631.32 -60.09 95% mean confidence interval for cycles %-change: -2.06% -0.12% Cycles are helped. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39023>	2025-12-18 17:55:29 +00:00
Kenneth Graunke	d83c699045	brw: Convert GS pulled inputs to use URB intrinsics We leave GS pushed inputs using load_per_vertex_input for now - they're relatively simple, and using load_attribute_payload doesn't work well since it's assumed to be convergent (for TES, FS inputs) while GS inputs are divergent. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38990>	2025-12-18 06:39:02 +00:00
Kenneth Graunke	eae3bd19d4	brw: Move GS URB Read Length limiting to brw_nir_lower_gs_inputs() We're going to be deciding on push vs. pull in the NIR lowering pass soon, so move the code to limit our register usage from brw's thread payload code to brw_nir_lower_gs_inputs(). Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38990>	2025-12-18 06:39:02 +00:00
Kenneth Graunke	8889802271	brw: Make max_push_bytes a parameter to URB lowering data This allows us to program something other than a stage-based constant. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38990>	2025-12-18 06:39:02 +00:00
Kenneth Graunke	f62f7d80e2	brw: Update try_load_push_input to handle dword-unit offsets too We don't need this case today, but it's trivial to handle. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38990>	2025-12-18 06:39:01 +00:00
Caio Oliveira	9c16bbd023	brw: Perform mark_last_urb_write_with_eot optimization after CFG Avoid using exec_node::remove() and the initial "main list of instructions", and instead use the existing helpers like other passes. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37146>	2025-12-16 17:02:58 +00:00
Caio Oliveira	e53576a559	brw: Move MATH related validation Moved existing checks to EU validation and added a few more based on instruction description in the various PRMs / BSpec. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38877>	2025-12-16 01:34:46 +00:00
Caio Oliveira	55863c1267	brw: Add EU validation for ROR/ROL And remove asserts() in generator. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38877>	2025-12-16 01:34:46 +00:00
Caio Oliveira	47d8ed1177	brw: Move PLN/LINE normalization Add validation for Source 0 and move the normalization into the code producing the instruction. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38877>	2025-12-16 01:34:44 +00:00
Caio Oliveira	3f436bdc6e	brw: Make LINE normalization into validation Add validation for Source 0. Should not cause problems since this instruction is not used by the compiler anymore. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38877>	2025-12-16 01:34:43 +00:00
Caio Oliveira	75cf20f0eb	brw: Remove LINE from brw_builder and brw_generator Gfx9 only instruction that is not used anymore. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38877>	2025-12-16 01:34:42 +00:00
Caio Oliveira	cd3e3dd0d3	brw: Drop asserts for brw_SRND These are already covered by the EU validation. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38877>	2025-12-16 01:34:41 +00:00
Caio Oliveira	68190499df	brw: Move ADD related validation Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38877>	2025-12-16 01:34:40 +00:00
Caio Oliveira	6ae92d3372	brw: Move AVG related validation Couldn't find in the docs a reference for the types needing to match, and simulator + MTL seem fine with mixing UD and UW, so not adding a replacement for the removed assertions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38877>	2025-12-16 01:34:38 +00:00
Caio Oliveira	6d8d733d4d	brw: Move MUL related validation Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38877>	2025-12-16 01:34:34 +00:00
Kenneth Graunke	26523bedec	brw: Call nir_opt_offsets for mesh shaders Most stages call this as part of brw_nir_postprocess_opts() but mesh lowers to URB intrinsics after that since it needs bit-sizes lowered. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38918>	2025-12-16 00:58:46 +00:00
Kenneth Graunke	d831f38d11	brw: Delete all the old backend mesh/task URB handling code This has all been replaced by NIR lowering to URB intrinsics. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38918>	2025-12-16 00:58:46 +00:00
Kenneth Graunke	d0dc45955d	brw: Lower task shader payload access in NIR We keep this separate from the other lowering infrastructure because there's no semantic IO involved here, just byte offsets. Also, it needs to run after nir_lower_mem_access_bit_sizes, which means it needs to be run from brw_postprocess_opts. But we can't do the mesh URB lowering there because that doesn't have the MUE map. It's not that much code as a separate pass, though. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38918>	2025-12-16 00:58:46 +00:00
Kenneth Graunke	bd0c173595	brw: Lower mesh shader outputs in NIR With all the infrastructure in place, this is largely a matter of calling the lowering passes with the appropriate data from the MUE map. MUE initialization is now done with semantic IO instead of raw offsets. This drops another case of non-standard NIR IO usage (and no_validate). Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38918>	2025-12-16 00:58:44 +00:00
Kenneth Graunke	6e5cc63a3a	brw: Extend URB lowering infrastructure to handle mesh shader outputs Mesh shaders introduce per-primitive outputs, and also our MUE layout has per-vertex data starting at an offset. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38918>	2025-12-16 00:58:43 +00:00
Lionel Landwerlin	60db7f20c9	brw: move MUE initialization out of the SIMD loop Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38918>	2025-12-16 00:58:42 +00:00
Lionel Landwerlin	d3053fb3d2	brw: Implement URB handle intrinsics for task/mesh stages (Split by Ken from a larger patch originally written by Lionel.) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38918>	2025-12-16 00:58:40 +00:00
Kenneth Graunke	d18423b116	brw: Make lower_{inputs,outputs}_to_urb_intrinsics non-static I want to reuse these in brw_compile_mesh.cpp. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38918>	2025-12-16 00:58:40 +00:00
Kenneth Graunke	788c49ecc6	brw: Extend load_urb/store_urb to handle 32-bit non-vec4-aligned access (Based on the original implementation by Lionel Landwerlin, but adapted to my respun URB lowering framework.) The mesh shader URB payload requires reading and writing fields at arbitrary DWord offsets. For example, the Primitive Indices array starts at DWord 1, and it can be a vec1[], vec2[], or vec3[] array, leading to very unaligned and sometimes double-parked elements. Still, most fields are still conveniently vec4-aligned. To handle this, we add a new cb_data::vec4_access flag. If set, access remains in vec4 units, with vec4 alignment. We use this for non-mesh stages. When unset, offset is in 32-bit units, allowing unaligned DWord access. This is trivial to support on Xe2, where the LSC URB messages support arbitrary byte-aligned addressing. On older platforms, we have to convert this to vec4 aligned offsets plus a component offset (either returning a subset of the channels loaded, or using component masking to store a subset of a vec4/vec8). Thankfully, since the OWord URB messages support accessing a vec8 at a time, this means we can do any vec4 access in one message, even if it's double-parked. We use mod-analysis to see if we can statically determine the sub-vec4 component offset required (we often can). If not, we use the ability to have dynamic writemasks to sort it out. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38918>	2025-12-16 00:58:38 +00:00
Kenneth Graunke	2b700f6bfd	brw: Delete attr_desc struct Unused since commit `18bbcf9a63`. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38918>	2025-12-16 00:58:37 +00:00
Kenneth Graunke	8177695403	brw: Add missed access to store_urb_lsc_intel intrinsics I forgot to copy this over in the LSC case. This meant we were missing reorderability which meant that we were missing out on CSE. fossil-db results on Battlemage: Instrs: 231471427 -> 231363032 (-0.05%) Send messages: 12077759 -> 12019628 (-0.48%) Cycle count: 34058451430.0 -> 34057005552.0 (-0.00%); split: -0.01%, +0.00% Spill count: 520387 -> 520135 (-0.05%) Fill count: 470812 -> 470722 (-0.02%) Max live registers: 72111834 -> 71873886 (-0.33%) Totals from 2898 (0.37% of 788851) affected shaders: Instrs: 1223836 -> 1115441 (-8.86%) Send messages: 148633 -> 90502 (-39.11%) Cycle count: 17732554.0 -> 16286676.0 (-8.15%); split: -10.65%, +2.49% Spill count: 252 -> 0 (-inf%) Fill count: 90 -> 0 (-inf%) Max live registers: 491684 -> 253736 (-48.39%) Non SSA regs after NIR: 255397 -> 255402 (+0.00%) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38918>	2025-12-16 00:58:36 +00:00

4838 commits