fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-20 19:58:19 +02:00

Author	SHA1	Message	Date
Iván Briano	5b48805b42	brw: fix local_invocation_index with quad derivatives on mesh/task shaders For mesh/task shaders, the thread payload provides a local invocation index, but it's always linear so it doesn't give the correct value when quad derivatives are in use. The lowering pass where all of this is done correctly for compute shaders assumes load_local_invocation_index will be lowered in the backend for mesh/task, calculates the values for the quads correctly but then avoids replacing the original intrinsic and we remain with the wrong results. Add an intel specific intrinsic and always lower the generic one to that (or whatever else was calculated) to avoid ambiguities and fix the value for quad derivatives. Fixes future CTS tests using mesh/task shaders under: dEQP-VK.spirv_assembly.instruction.compute.compute_shader_derivatives.* Fixes: `d89bfb1ff7` ("intel/brw: Reorganize lowering of LocalID/Index to handle Mesh/Task") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39276>	2026-01-27 22:28:19 +00:00
Kenneth Graunke	41d7debcfe	brw: Use nir_imul_imm in per-vertex/per-primitive offset calculation This avoids generating some useless math that would need to be cleaned up later, without complicating things too much. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>	2026-01-27 16:08:36 +00:00
Kenneth Graunke	24c66d3871	brw: Vectorize URB intrinsics using nir_opt_load_store_vectorize This helps cut down URB messages on tessellation and mesh shaders significantly. fossil-db results on Battlemage: Instrs: 505172392 -> 505207187 (+0.01%); split: -0.00%, +0.01% Send messages: 23678197 -> 23656126 (-0.09%); split: -0.09%, +0.00% Cycle count: 63150470088 -> 63147482640 (-0.00%); split: -0.01%, +0.00% Spill count: 576554 -> 576616 (+0.01%) Fill count: 545304 -> 545413 (+0.02%) Max live registers: 141099192 -> 141150675 (+0.04%); split: -0.00%, +0.04% Max dispatch width: 39856192 -> 39856208 (+0.00%) Totals from 4231 (0.27% of 1583648) affected shaders: Instrs: 1620161 -> 1654956 (+2.15%); split: -0.25%, +2.40% Send messages: 128652 -> 106581 (-17.16%); split: -17.18%, +0.03% Cycle count: 24650700 -> 21663252 (-12.12%); split: -12.82%, +0.70% Spill count: 378 -> 440 (+16.40%) Fill count: 1308 -> 1417 (+8.33%) Max live registers: 364676 -> 416159 (+14.12%); split: -0.24%, +14.36% Max dispatch width: 67952 -> 67968 (+0.02%) There are several reasons we didn't go with nir_opt_vectorize_io: 1. nir_opt_vectorize_io appears to work on the slot location level. We want to be able to vectorize based on the URB offsets, especially for cases like point size, layer, and viewport which have different VARYING_SLOT_* values but live in the same vec4 in a URB entry. 2. We want vec8 stores, and nir_opt_vectorize_io only seems to vectorize within a single 32-bit vec4. It does handle 8 components, but that's only for packing 16-bit values into a 32-bit vec4. Improves performance of Sascha Willems' tessellation demo by around 4% on Meteorlake. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>	2026-01-27 16:08:36 +00:00
Kenneth Graunke	aafe8967fd	brw: Avoid using URB global offset with per-slot offsets on <= Icelake Both the URB Global Offset and Per-Slot Offsets are specified to be unsigned numbers. The URB Global Offset is only 11 bits, and so is limited to be between [0, 2047]. While the per-slot offsets are given as U32 values, it would appear that adding the two offsets does not handle 32-bit overflow/unsigned wrap correctly. This pops up in Piglit's TCS variable-indexing tests, which ends up performing loads from offset (x - 16) and a base of 18, and at an offset (x) with a base of 2. These should be equivalent, but when x <= 15, the per-slot offset calculated in the shader is negative (0xfffffff[0-f]) and adding the base of 18 is not wrapping around correctly to [2, 17]. To work around this, avoid using the global offset when the per-slot offset is present, and just add the two in the shader where unsigned wrap works correctly. Tigerlake and later don't seem to have this issue. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>	2026-01-27 16:08:36 +00:00
Kenneth Graunke	07ac0e3463	brw: Skip vec8 store_urb_vec4_intel noop writemasks as well We were checking for 0xf which is fine for vec4, but vec8 gets 0xff. Either way, nothing is writemasked, so we can skip sending the mask. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>	2026-01-27 16:08:36 +00:00
Kenneth Graunke	dbb24ff56b	brw: Assert that urb_vec4_intel stores only have 4/8 components vec1-3, 5-7, and 9+ are not supported. Only vec4 and vec8. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>	2026-01-27 16:08:36 +00:00
Kenneth Graunke	c2f03ba12f	nir: Add memory modes to URB load intrinsics This makes it easier for NIR passes to distinguish between inputs and outputs without having to reason about which URB handle source was passed to the intrinsic. It probably also makes it a bit easier for humans to read the NIR too. v2: Don't add memory mode to store intrinsics. It's always output. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>	2026-01-27 16:08:36 +00:00
Alyssa Rosenzweig	3361ca86cf	brw: hoist fsat lower OOTL Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39539>	2026-01-26 23:24:49 +00:00
Alyssa Rosenzweig	f16ec90caa	brw: move fsign lower OOTL Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39539>	2026-01-26 23:24:49 +00:00
Alyssa Rosenzweig	9c9680d16f	brw: use BITSET_LINEAR_ZALLOC Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39494>	2026-01-26 20:15:43 +00:00
Alyssa Rosenzweig	5409d872f7	brw: remove a redundant DCE Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39513>	2026-01-25 17:17:06 -08:00
Alyssa Rosenzweig	5fe71dc717	brw: combine more peephole select Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39513>	2026-01-25 17:17:06 -08:00
Alyssa Rosenzweig	b34806e357	brw: optimize bfi only late Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39513>	2026-01-25 17:17:06 -08:00
Alyssa Rosenzweig	c45c5440cd	brw: run nir_opt_idiv_const only once Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39513>	2026-01-25 16:17:30 -08:00
Alyssa Rosenzweig	a8b78e5a8c	brw: only optimize ray queries if there are any Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39513>	2026-01-25 16:17:30 -08:00
Alyssa Rosenzweig	7078496efe	brw: only optimize ray queries once Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39513>	2026-01-25 16:17:30 -08:00
Alyssa Rosenzweig	99d22bc35e	brw: run opt_deref only once Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39513>	2026-01-25 16:17:30 -08:00
Alyssa Rosenzweig	73fa431bff	brw: unloop post-mem vectorize opts There's enough looping happening elsewhere for it to not really matter. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39514>	2026-01-25 23:41:41 +00:00
Alyssa Rosenzweig	11dba60e6e	brw: hoist lower_pack OOTL Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39514>	2026-01-25 23:41:41 +00:00
Alyssa Rosenzweig	3cfc431fb2	brw: remove redundant nir_opt_combine_stores Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39514>	2026-01-25 23:41:40 +00:00
Alyssa Rosenzweig	ced1adcad7	brw: move nir_opt_memcpy OOTL Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39514>	2026-01-25 23:41:40 +00:00
Caio Oliveira	74f1d4f47b	intel/compiler: Use SPDX annotations Minor adjustments to formatting of the copyright line, but keep dates and holders. "Authors" entries that could be obtained via Git logs were also removed. The license in brw_disasm.c and elk_disasm.c don't match directly any SPDX pattern I could find, so kept as is. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39503>	2026-01-24 20:37:31 +00:00
Caio Oliveira	dc352f3d7c	brw: Don't increment block loads addresses unless needed Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39447>	2026-01-24 04:38:23 +00:00
Sushma Venkatesh Reddy	0ce4e8ba6f	brw: Use lookup tables for Gfx12+ 3src type encoding/decoding The previous Gfx12+ implementation using bit masking is failing for FP8 types, so replacing with explicit lookup tables. For float types, the encoding now aligns with brw_data_type_float, ensuring correct behavior for DPAS and other 3-source instructions. Fixes: `d1d4e3d530` ("brw: Add EU assembler support for float8") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39448>	2026-01-24 01:37:12 +00:00
Caio Oliveira	9c602503f6	brw: Remove block_list in favor of blocks array Code kept track of blocks both in a linked list and in an array. Change the client code of the list to just use the array so we just maintain one. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39246>	2026-01-24 01:15:52 +00:00
Caio Oliveira	e44ccaa5cf	brw: Remove foreach_block_safe / reverse_safe The code currently doesn't remove blocks: when a block is about to become empty, the code will replace the last instruction with a NOP. If we want to have actual block removals again, there are other strategies than removing them as we iterate (e.g. allow empty blocks and then collect them in a pass or right after iteration). So remove those macros. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39246>	2026-01-24 01:15:52 +00:00
Caio Oliveira	bf822495fe	brw: Remove tabs from brw_cfg.cpp Use spaces like rest of the code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39246>	2026-01-24 01:15:51 +00:00
Caio Oliveira	9994db58b3	brw: Remove global variables from brw_asm parser Use the same features as other Flex/Bison parsers in the codebase. Acked-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39363>	2026-01-23 23:13:31 +00:00
Caio Oliveira	1db92ee9fc	brw: Move the brw_codegen inside brw_asm_parser Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39363>	2026-01-23 23:13:31 +00:00
Caio Oliveira	ad6a342d42	brw: Move brw_last_inst macro to assembler Change the few other cases to an inline function that does the same job. This macro will change in ways that are not compatible with the non-assembler usages. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39363>	2026-01-23 23:13:31 +00:00
Caio Oliveira	a5fac4e084	brw: Create a struct to hold parser state Hold most of the parser data. Remaining will be moved in follow-up patches. The struct itself is still a global for now. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39363>	2026-01-23 23:13:31 +00:00
Calder Young	895ff7fe92	Revert "anv,brw: Allow multiple ray queries without spilling to a shadow stack" This optimization doesn't work when the ray query index isn't uniform across the subgroup, which is something the spec allows. While there are some smart ways to fix this and still avoid unnecessary spilling, it's not worth investing the time until we find a realtime raytracing workload that actually needs to use multiple live ray queries for something. Fixes: `1f1de7eb` ("anv,brw: Allow multiple ray queries without spilling to a shadow stack") Acked-by: Sagar Ghuge <sagar.ghuge@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39445>	2026-01-23 21:33:55 +00:00
Sagar Ghuge	6aa3b70382	anv: Mark RootNodeOffset at 256B always This commit change the BVH layout a little so that we can load the BVH offset as constant rather than reading from memory. We have to force the instance leaves pointer at the end which gets used in copy.comp shader. Totals: Instrs: 54798 -> 54728 (-0.13%) Send messages: 3854 -> 3847 (-0.18%) Cycle count: 1915106 -> 1913954 (-0.06%); split: -0.07%, +0.01% Non SSA regs after NIR: 18594 -> 18575 (-0.10%) Totals from 7 (7.37% of 95) affected shaders: Instrs: 5532 -> 5462 (-1.27%) Send messages: 367 -> 360 (-1.91%) Cycle count: 132592 -> 131440 (-0.87%); split: -1.01%, +0.14% Non SSA regs after NIR: 1989 -> 1970 (-0.96%) PERCENTAGE DELTAS Shaders Instrs Send messages Cycle count Non SSA regs after NIR q2rtx-rt-pipeline 95 -0.13% -0.18% -0.06% -0.10% -------------------------------------------------------------------------------------- All affected 7 -1.27% -1.91% -0.87% -0.96% -------------------------------------------------------------------------------------- Total 95 -0.13% -0.18% -0.06% -0.10% Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39106>	2026-01-22 23:20:04 +00:00
Caio Oliveira	c8375c0f71	brw/scoreboard: Support local implicit out-of-order dependencies In software scoreboard (Gfx12+) use information from previous instructions to trim out-of-order dependencies. For example, in send g1, g2 ($1) mov g3, g1 ($1.dst) // Depends on g1 (destination of $1) mov g4, g2 ($1.src) // Depends on g2 (source of $1) mov g5, g1 ($1.dst) // Depends on g1 (destination of $1) only the first `mov` needs to be annotated, because the execution will stall until that dependency is fulfilled, which in this case means the `send` is done and `g1` was already written. Note that while `$x.dst` implies `$x.src`, the reverse is not true, so if the first `mov` did not exist, both second and third `mov` in the example would have to keep their annotations. This patch add resolution of implicit out-of-order dependencies that are visible inside a block. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3526>	2026-01-21 22:29:28 +00:00
Caio Oliveira	ba317e14a0	brw: Provide ~ and &= operators for tgl_sbid_mode Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3526>	2026-01-21 22:29:28 +00:00
Caio Oliveira	2ebacbc78d	brw/scoreboard: Add tests showing implicit unordered dependencies in SWSB Mark tests as disabled for now. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3526>	2026-01-21 22:29:28 +00:00
Caio Oliveira	423916152e	brw/scoreboard: Use std::vector when applicable There's agreement now these are helpful and widely supported. We can always fallback to a custom vector class later if necessary. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3526>	2026-01-21 22:29:27 +00:00
Lionel Landwerlin	79aff6e274	brw: use fp64 to compute coarse_z For some reason we cannot get the precision needed from the HW at fp32. LNL internal fossildb changes : Totals from 7226 (0.76% of 947978) affected shaders: Instrs: 5512598 -> 5586086 (+1.33%); split: -0.00%, +1.33% Cycle count: 153836056 -> 155079472 (+0.81%); split: -0.77%, +1.58% Spill count: 2025 -> 2021 (-0.20%); split: -0.35%, +0.15% Fill count: 3139 -> 3112 (-0.86%); split: -1.12%, +0.25% Max live registers: 1034601 -> 1034632 (+0.00%); split: -0.00%, +0.00% Max dispatch width: 207296 -> 207264 (-0.02%); split: +0.02%, -0.03% Non SSA regs after NIR: 1147942 -> 1109326 (-3.36%) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12726 Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38996>	2026-01-21 16:00:52 +00:00
Lionel Landwerlin	a19e949824	brw: move coarse_z computation to NIR So that we can print it easily with debug printfs Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38996>	2026-01-21 16:00:52 +00:00
Lionel Landwerlin	89a53f048a	brw: make coarse pixel bit available to NIR lowering Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38996>	2026-01-21 16:00:51 +00:00
Lionel Landwerlin	e3fd1b0ac0	brw: populate wm_prog_data earlier So that we can put the coarse_pixel_dispatch value available to NIR lowering. LNL internal fossildb changes: Totals from 40 (0.01% of 490838) affected shaders: Instrs: 33321 -> 33311 (-0.03%); split: -0.04%, +0.01% Cycle count: 780136 -> 779936 (-0.03%); split: -0.03%, +0.00% Max live registers: 5292 -> 5298 (+0.11%) Non SSA regs after NIR: 26638 -> 26464 (-0.65%) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38996>	2026-01-21 16:00:51 +00:00
Lionel Landwerlin	6a7ff83874	brw: set nir_shader_compiler_options::has_pixel_coord Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38996>	2026-01-21 16:00:50 +00:00
Lionel Landwerlin	3d2a696763	brw: treat inline parameters like UNIFORM Makes a bunch of copy propagation and other passes work much better. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39382>	2026-01-20 21:25:53 +00:00
Lionel Landwerlin	1d1866a84b	brw: apply same workaround to spawn as trace opcode Working around BRW's limitations Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39382>	2026-01-20 21:25:52 +00:00
Lionel Landwerlin	0e9453291c	brw: improve push constant loading using base offsets Xe2+ only Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39382>	2026-01-20 21:25:52 +00:00
Lionel Landwerlin	c1ef494b08	brw: add missing base offset decoding Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39382>	2026-01-20 21:25:52 +00:00
Georg Lehmann	050507ab81	brw: make sure nir_opt_algebraic_late was called after late brw_nir_optimize Not only is it questionable for code quality to not call nir_opt_algebraic_late after nir_opt_algebraic, it also breaks correctness for late lowerings. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39180>	2026-01-19 16:11:28 +00:00
Alyssa Rosenzweig	a11aa3fc4e	brw: combine peephole select calls Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39361>	2026-01-16 21:24:15 +00:00
Calder Young	d69daf28d0	anv,brw: Add helper to get stack ids per dss for ray queries Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38778>	2026-01-16 09:21:50 +00:00
Calder Young	1f1de7ebd6	anv,brw: Allow multiple ray queries without spilling to a shadow stack Allows a shader to have multiple ray queries without spilling them to a shadow stack. Instead, the driver provides the shader with an array of multiple RTDispatchGlobals structs to give each query its own dedicated stack. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38778>	2026-01-16 09:21:50 +00:00
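
Note on aafe8967fd (URB global offset with per-slot offsets): the arithmetic in that commit message can be sketched in C. This is only an illustration under stated assumptions. The function names are hypothetical and do not exist in Mesa, and non_wrapping_sum is merely one guess at what an adder that fails to wrap at 32 bits might produce; the commit itself only says the hardware sum "does not handle 32-bit overflow/unsigned wrap correctly".

```c
#include <stdint.h>

/* Sum as computed in the shader: uint32_t arithmetic wraps modulo 2^32,
 * so a "negative" per-slot offset plus a base lands back in range. */
static uint32_t shader_sum(uint32_t per_slot, uint32_t base)
{
    return per_slot + base;
}

/* ASSUMPTION, for illustration only: an adder that keeps the carry
 * instead of wrapping at 32 bits. The real hardware behavior is not
 * documented in the commit message. */
static uint64_t non_wrapping_sum(uint32_t per_slot, uint32_t base)
{
    return (uint64_t)per_slot + (uint64_t)base;
}
```

With x = 0, the per-slot offset x - 16 is 0xfffffff0. shader_sum(0xfffffff0, 18) wraps to 2, matching the equivalent access at offset x with base 2, while the hypothetical non-wrapping sum is 0x100000002, which is outside [2, 17]. Adding the two offsets in the shader, as the commit does for Icelake and earlier, avoids depending on the second behavior.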

4906 commits
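
Note on 07ac0e3463 and dbb24ff56b (store_urb_vec4_intel writemasks): the check those two commit messages describe can be sketched as below. This is a minimal sketch, not the Mesa implementation. The helper name writemask_is_noop and its signature are hypothetical; the only facts taken from the log are that a full mask is 0xf for vec4 and 0xff for vec8, and that only 4 and 8 components are supported.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true when the writemask enables every
 * component of the store, so sending the mask would be redundant. */
static bool writemask_is_noop(uint32_t mask, unsigned num_components)
{
    /* Per dbb24ff56b, only vec4 and vec8 stores are supported. */
    if (num_components != 4 && num_components != 8)
        return false;

    /* 0xf for vec4, 0xff for vec8. */
    uint32_t full_mask = (1u << num_components) - 1u;
    return mask == full_mask;
}
```

Comparing against a mask derived from the component count, rather than the constant 0xf, is what lets the vec8 case be recognized as a no-op as well.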