fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-24 08:50:13 +01:00

Author	SHA1	Message	Date
Iván Briano	ac182d6045	brw/mesh: drop brw_tue_map::per_task_data_start_dw It's always set to a fixed value and not used in many places. Use the value directly where it's needed. Suggested-by: Lucas Fryzek <lfryzek@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37648>	2025-10-03 17:36:43 +00:00
Iván Briano	e624174134	anv: handle compiling of mesh shader separately from task shader With EXT_shader_object, it became possible to compile shaders independently and then use them together later, so we cannot rely on the lack of task shader data to decide that no task shader will be used. The flag VK_SHADER_CREATE_NO_TASK_SHADER_BIT_EXT exists for that purpose, but it doesn't really make any difference for us. Always assume that if the mesh shader is reading the task payload, it's going to be used with one, as otherwise the application is doing it wrong. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13983 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37648>	2025-10-03 17:36:43 +00:00
Kenneth Graunke	29d30c6f3d	brw: Only skip SIMD widths based on pressure if an smaller one compiled Sometimes the compute shader workgroup size requires a larger SIMD width than the minimum in order to fit in the available threads. In that case we'll skip the SIMD8 shader, and need to try SIMD16 regardless of how the register pressure estimate looks. Fixes: `3af4e63061` ("brw: Skip compilation of larger SIMDs when pressure is too high") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37649>	2025-10-02 16:17:26 -07:00
Alyssa Rosenzweig	c2ae207e80	brw,anv: use XML-based stats I didn't bother switching either iris or elk/hasvk but one could. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37517>	2025-10-02 20:22:00 +00:00
José Roberto de Souza	c008d21947	intel/brw: Move brw_s0() to brw_reg.h It remove a duplication and also it will be used in a future patch from other file. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37670>	2025-10-02 10:46:10 -07:00
Kenneth Graunke	3af4e63061	brw: Skip compilation of larger SIMDs when pressure is too high This allows us to skip the entire backend compilation process for large SIMD widths when register pressure is high enough that we'd likely decide to prefer a smaller one in the end anyway. The hope is to make the same decisions as before, but with less CPU overhead. We are making mostly the same decisions as before: \| API / Platform \| Total Shaders \| Changed \| % Identical -------------------------------------------------- \| VK / Arc A770 \| 905,525 \| 1,157 \| 99.872% \| \| VK / Arc B580 \| 788,127 \| 53 \| 99.993% \| \| VK / Panther \| 786,333 \| 13 \| 99.998% \| \| GL / Arc A770 \| 308,618 \| 269 \| 99.913% \| \| GL / Arc B580 \| 264,066 \| 13 \| 99.995% \| \| GL / Panther \| 273,212 \| 0 \| 100.000% \| Improves compile times on my i7-12700K: \| Game \| Arc B580 \| Arc A770 \| --------------------------------------------------- \| Assassins Creed: Odyssey \| -13.47% \| -10.98% \| \| Borderlands 3 (DX12) \| -10.05% \| -11.31% \| \| Dark Souls 3 \| -21.06% \| -21.08% \| \| Oblivion Remastered \| -11.10% \| -9.82% \| \| Phasmophobia \| -32.73% \| -31.00% \| \| Red Dead Redemption 2 \| -20.10% \| -14.38% \| \| Total War: Warhammer III \| -10.11% \| -14.44% \| \| Wolfenstein Youngblood \| -15.91% \| -13.47% \| \| Shadow of the Tomb Raider \| -30.23% \| -25.86% \| It seems to have nearly no effect on compile times on Xe3 unfortunately, as only 1,014 shaders in fossil-db even fail SIMD32 compilation in the first place, and we want to let most of the "might succeed" cases through to the backend for throughput analysis. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36750>	2025-09-30 19:44:03 +00:00
Kenneth Graunke	248050b6d0	brw: Add a quick NIR-based register pressure estimate pass This tries to calculate an underestimate (lower bound) for the register pressure at various SIMD widths, by counting live values in the NIR shader. This fundamentally won't be accurate, but it can give us an idea of whether it's even worth trying a certain SIMD-width compile. Doing this at the NIR level means we: - Can use SSA structure rather than fuzzy liveness intervals - Can avoid the backend scheduler aggressively trying to hide latency, presenting an overinflated view of the register pressure - Have divergence information on-hand, making it easier to "scale up" - Can skip cloning and optimizing NIR for compute shader SIMD widths Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36750>	2025-09-30 19:44:03 +00:00
Kenneth Graunke	5ebd766156	brw: Do most of NIR postprocessing before cloning for SIMD variants We were doing a lot of NIR work repeatedly for each SIMD variant of compute and mesh shaders. Instead, do it once before cloning, and just do one final optimization loop and out-of-SSA for each. fossil-db results on Arc B580: Totals: Instrs: 233771096 -> 233794024 (+0.01%); split: -0.01%, +0.02% Subgroup size: 15922768 -> 15922736 (-0.00%); split: +0.00%, -0.00% Send messages: 12095619 -> 12098234 (+0.02%); split: -0.00%, +0.02% Loop count: 137562 -> 137523 (-0.03%) Cycle count: 32600323744 -> 32667411252 (+0.21%); split: -0.06%, +0.27% Spill count: 540908 -> 542027 (+0.21%); split: -0.07%, +0.28% Fill count: 700938 -> 698983 (-0.28%); split: -0.73%, +0.45% Scratch Memory Size: 37266432 -> 37304320 (+0.10%); split: -0.10%, +0.20% Max live registers: 72691728 -> 72692987 (+0.00%); split: -0.00%, +0.00% Non SSA regs after NIR: 67690309 -> 67688352 (-0.00%); split: -0.01%, +0.00% Totals from 3576 (0.45% of 789301) affected shaders: Instrs: 6932956 -> 6955884 (+0.33%); split: -0.41%, +0.74% Subgroup size: 88816 -> 88784 (-0.04%); split: +0.09%, -0.13% Send messages: 329168 -> 331783 (+0.79%); split: -0.02%, +0.81% Loop count: 8753 -> 8714 (-0.45%) Cycle count: 15153678820 -> 15220766328 (+0.44%); split: -0.14%, +0.58% Spill count: 213751 -> 214870 (+0.52%); split: -0.18%, +0.71% Fill count: 282616 -> 280661 (-0.69%); split: -1.82%, +1.13% Scratch Memory Size: 13056000 -> 13093888 (+0.29%); split: -0.27%, +0.56% Max live registers: 834757 -> 836016 (+0.15%); split: -0.11%, +0.26% Non SSA regs after NIR: 995033 -> 993076 (-0.20%); split: -0.48%, +0.28% Looking at a few of the shaders with substantial instruction count increases, it appears that it is largely due to more loops being unrolled, which is probably actually a good thing. The compile time impact of this patch appears to be negligable. 
However, doing postprocessing before SIMD cloning allows us to examine the postprocessed SSA-form NIR for improvements in an upcoming patch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36750>	2025-09-30 19:44:02 +00:00
Kenneth Graunke	0712c220ab	brw: Split brw_postprocess_nir() into two pieces brw_postprocess_nir contains a lot of stuff these days. The first part does a bunch of lowering and cleanup optimizations in SSA form. The second part does some post-optimization lowering and the out-of-SSA conversion. We may want to do additional work before the post-optimization/post-SSA phase. Splitting this allows us to insert such tasks in the "middle". For convenience, brw_postprocess_nir() becomes a wrapper which invokes both parts, so callers can continue working as they did until they have a reason to do otherwise. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36750>	2025-09-30 19:44:02 +00:00
Kenneth Graunke	71b513a1e9	brw: Lower certain subgroup size modes in brw_preprocess_nir This allows us to lower known subgroup size cases earlier, giving us some earlier optimization opportunities. We would need to know the actual SIMD width to handle certain cases, but we can just pass 0 here, which will lead to get_subgroup_size returning 0 - the same as leaving this unset. We can come back to that later during the per-SIMD-width postprocessing. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36750>	2025-09-30 19:44:02 +00:00
Kenneth Graunke	3e493e03cc	brw: Move "SSA form" printing to after divergence analysis is run We were printing the SSA form, then immediately running divergence analysis. This patch flips those, so we can see con/div in INTEL_DEBUG output for SSA form, which is really useful. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36750>	2025-09-30 19:44:02 +00:00
Kenneth Graunke	1b0808adf3	intel/nir: Make ffma peephole optimization preserve fp_fast_math flags float_controls2 may have marked these as needing to preserve NaN or other values. If so, our newly contracted ffma needs to as well. Fixes dEQP-VK.spirv_assembly.instruction.compute.float_controls2..input_args.mat_det_testedWithout_NotNan when nir_opt_algebraic is run after this pass. Cc: mesa-stable Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36750>	2025-09-30 19:44:02 +00:00
Ian Romanick	23bd356b42	brw/nir: nir_intrinsic_load_reloc_const_intel may not be scalar [v3] If the (NIR) destination is a register (i.e., not an SSA value), the destination of the BRW instruction will not be is_scalar. This occurs in some shaders in Final Fantasy XVI (and finalfantasytype0_1.rdc.2826e29da3722a83.1.foz). If the destination is not is_scalar, revert most of this code to the state previous to `f3593df877`. This means - Allocate a SIMD1 register and UNDEF it. - Emit a SIMD1 MOV_RELOC_IMM to that register. - Emit an additional MOV to expand the SIMD1 result. Closes: #12520 Fixes: `f3593df877` ("brw/nir: Treat load_reloc_const_intel as convergent") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37384>	2025-09-29 16:48:07 +00:00
José Roberto de Souza	141a225ca1	intel/brw: Use ASR over SHR for SHADER_OPCODE_ISUB_SAT src[1]/src0 is signed and Xe2+ SHR don't support operations over signed data types so lets switch this over ASR that supports signed data types. Cc: mesa-stable Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37557>	2025-09-26 16:44:24 +00:00
Tim Van Patten	f90e0f0797	intel: Convert getenv() to os_get_option() os_get_option() is a wrapper for getenv() that checks properties in Android. It should be a no-op for other OS but will allow full use of env vars in Android. The environment variable names are automatically renamed by os_get_option() and the order of precedence thus becomes: 1. getenv (non-Android) 2. debug.mesa.* (Android) 3. vendor.mesa.* (Android) 4. mesa.* (Android, as a fallback for older versions) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37587>	2025-09-25 17:01:18 -06:00
Caio Oliveira	f011e5707d	brw: Identify if/break/endif special case before emission Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37147>	2025-09-25 06:36:10 +00:00
Caio Oliveira	9f6155e47d	brw: Also include the final disassembly in the debug archive This doesn't replace existing support for INTEL_DEBUG=shaders -- so both `shaders` and `mda` can be used. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29146>	2025-09-24 23:08:45 -07:00
Caio Oliveira	cdef824b7a	brw: Include some NIR states in the debug archive Acked-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29146>	2025-09-24 23:08:45 -07:00
Caio Oliveira	f82d85a685	brw: Use debug archive file with INTEL_DEBUG=mda Instead of dumping multiple files with the optimizer passes, write a single archive file with all the contents. The actual file is created by the drivers, so later commits will actually enable the feature in anv and iris. This removes the use of INTEL_DEBUG=optimizer (and the corresponding enum value) in brw. That environment variable is still used by ELK -- which currently doesn't support mda. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29146>	2025-09-24 23:08:45 -07:00
Sushma Venkatesh Reddy	a1c5f1ccf6	intel/compiler: Validation for SRND instructions Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36529>	2025-09-24 17:18:37 +00:00
Sushma Venkatesh Reddy	fe6d364ca8	brw: Add assembler support for SRND Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36529>	2025-09-24 17:18:37 +00:00
Sushma Venkatesh Reddy	51f4a2572a	intel/compiler: Initial bits for SRND instruction Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36529>	2025-09-24 17:18:37 +00:00
Eric Engestrom	2f9fd1768a	intel/meson: generate spirv_info.h before compiling brw_spirv.c Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37544>	2025-09-24 10:23:18 +00:00
Lionel Landwerlin	e9910fa955	brw: fix type conversion in tex operation params Fix a bunch of tests in dEQP-VK.glsl.texture_gather.* on Xe2+ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `bddfbe7fb1` ("brw/blorp: lower MCS fetching in NIR") Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37532>	2025-09-24 08:47:03 +00:00
Lionel Landwerlin	8e93e7cd72	brw: layout patch in VUE in position independent way Only if required. I somehow misunderstood that those would need to be independent too, not just the vertex slots. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `8dee4813b0` ("brw: add ability to compute VUE map for separate tcs/tes") Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37251>	2025-09-23 16:01:30 +00:00
Lionel Landwerlin	73383fe7ef	brw: fix split_sends with txf combining Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37527>	2025-09-23 15:37:40 +00:00
Lionel Landwerlin	6dbcc81c85	brw: simplify texture surface/sampler handle sources We had twice surface/sampler sources for no good reason, just add a boolean to tell whether they are bindless or not. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37527>	2025-09-23 15:37:40 +00:00
Lionel Landwerlin	06cf911ab4	brw: lower shader opcode into tex_instr Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37527>	2025-09-23 15:37:40 +00:00
Lionel Landwerlin	bddfbe7fb1	brw/blorp: lower MCS fetching in NIR One advantage here of moving a bunch of stuff to NIR is that we can now have consistent payload types straight from the NIR conversion to BRW. This massively simplifies the BRW lowering code and avoids type errors that are quite common to make in the backend. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37527>	2025-09-23 15:37:40 +00:00
Lionel Landwerlin	d4ab2087cf	brw: lower non coherent FS load_output in NIR Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37527>	2025-09-23 15:37:39 +00:00
Ian Romanick	3e04990c68	elk: Increase the size of some structure fields in combine_constants In very large shaders, first_use_ip, last_use_ip, and even (register) nr can overflow 16 bits. Increase the size of these fields. Some structure components are rearranged to promote better packing. Fixes: `2dad1e3abd` ("i965/fs: Add pass to combine immediates.") Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37482>	2025-09-22 20:02:25 +00:00
Ian Romanick	b7e1ac8309	brw: Increase the size of some structure fields in combine_constants In very large shaders, first_use_ip, last_use_ip, and even (register) nr can overflow 16 bits. Increase the size of these fields. used_in_single_block is moved earlier in the structure to promote better packing. Fixes: `2dad1e3abd` ("i965/fs: Add pass to combine immediates.") Closes: #9489 Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Tested-by: Tapani Pälli <tapani.palli@intel.com> Tested-by: @joostruis Tested-by: @Snoucher Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37482>	2025-09-22 20:02:25 +00:00
Caio Oliveira	f65fbb23e2	brw: Fix encoding of 3-src dst in Xe2+ Use FD20 macro that will account for the implicit LSB zero value and is already used for sources. For the new macro we need to use the entire bit-range of the field (55-51), so remove the adjustments we used to do prior to encoding and decoding. Fixes assertion in vkpeak (https://github.com/nihui/vkpeak) when running bf16 tests on BMG. And the code now will correctly apply the subreg_nr to the destination, e.g. a mad(32) gets splitted into two pieces, the generation would not fill out the upper-part of the register ``` mad(16) g13<1>BF g10<8,8,1>BF g12<8,8,1>BF g56<1,1,1>F { align1 1H A@5 }; -mad(16) g13<1>BF g10.16<8,8,1>BF g12.16<8,8,1>BF g57<1,1,1>F { align1 2H A@5 }; +mad(16) g13.16<1>BF g10.16<8,8,1>BF g12.16<8,8,1>BF g57<1,1,1>F { align1 2H A@5 }; ``` Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37236>	2025-09-18 18:21:25 +00:00
Alyssa Rosenzweig	804ced9047	intel: drop legacy flatshade handling Let mesa/st do the keying instead. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37447>	2025-09-18 14:14:11 +00:00
Alyssa Rosenzweig	36bd06ebab	intel: drop clamp_fragment_color handling This is all dead code since we weren't even seting the cap in iris/crocus! Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37447>	2025-09-18 14:14:11 +00:00
Alyssa Rosenzweig	957f326a10	brw: drop printf info plumbing unused since printf hashing. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37447>	2025-09-18 14:14:10 +00:00
Alyssa Rosenzweig	bbf5bc8632	brw: cleanup int64 option set Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37447>	2025-09-18 14:14:09 +00:00
Alyssa Rosenzweig	168704c2fe	brw: hoist shared options out of the stage loop ideally we'd have no stage switching, but this is just a cleanup for now. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37447>	2025-09-18 14:14:09 +00:00
Alyssa Rosenzweig	0d7083d5bc	brw: drop indirection on compiler options I see no point, we allocate for every shader stage anyway. This is a bit simpler. I'm not a fan of the brw_compiler singleton at all but torching that is not on today's agenda. Flattening it a little bit very much is. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37447>	2025-09-18 14:14:08 +00:00
Alyssa Rosenzweig	2c161cc35d	brw: drop unused brw_kernel code unused since we dropped GRL. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37447>	2025-09-18 14:14:07 +00:00
Georg Lehmann	714a149396	nir: remove unsigned upper bound config All config information is now either in nir->info or nir->options. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37361>	2025-09-16 09:24:04 +00:00
Lionel Landwerlin	a69853ce5e	brw: improve eot_reg computation in register allocate Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `c4c7ff3f8f` ("brw: enable register allocation to deal with multiple EOTs") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37326>	2025-09-16 07:49:07 +00:00
Lionel Landwerlin	1f86a4ee37	brw: remove unused RT write code With `4fda724fd4` ("brw: Avoid invalid access when compacting out-of-bounds JIP/UIP") this stuff isn't needed anymore. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `fe38fb858c` ("brw: workaround broken indirect RT messages on Gfx11") Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37326>	2025-09-16 07:49:07 +00:00
Francisco Jerez	5c68b351fe	intel/brw: Fix regression in brw_allocate_registers() compiling large shaders with throughput==0. The following Vulkan CTS tests that emit massive shaders were regressing after "intel/brw/xe3+: Select scheduler heuristic with best trade-off between register pressure and latency.": dEQP-VK.graphicsfuzz.cov-nested-loops-set-struct-data-verify-in-function dEQP-VK.graphicsfuzz.cov-dfdx-dfdy-after-nested-loops The reason is that they have so many nested loops that they cause the performance analysis utilization estimates to overflow the 32-bit floating-point variables used to calculate them, which causes our throughput estimate to underflow and equal zero for those shaders, which breaks the logic introduced in brw_allocate_registers() to select the scheduling variant with highest throughput, since none of the scheduling modes tried has better throughput than the initial value equal to zero of "best_perf". Instead use -INFINITY as initial value for "best_perf" so we always select a scheduling mode. This should have been caught by CI but oddly the tests above are showing up as "not run" on my last baseline runs, so this wasn't flagged as a regression for me. v2: Use -INFINITY instead of previous approach that used NaN (Ian). Fixes: `531a34c7dd` ("intel/brw/xe3+: Select scheduler heuristic with best trade-off between register pressure and latency.") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13884 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13885 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37322>	2025-09-15 21:10:47 +00:00
Sushma Venkatesh Reddy	5f10c1a8fb	intel/compiler: generalize workaround script name for broader applicability Renamed brw_nir_trig_workarounds.py to brw_nir_workarounds.py to reflect its expanded scope beyond just trignometric workarounds. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36990>	2025-09-12 22:32:46 +00:00
Sushma Venkatesh Reddy	fe1d84e083	intel/compiler: apply sqrt workaround for Horizon Forbidden West shader Added a workaround for a known shader in Horizon Forbidden West that causes visual corruption on Intel anv driver. The fix clamps fsqrt inputs using fmax(x, 1e-12) to avoid invalid values. Integrated the workaround via brw_nir_apply_sqrt_workarounds() and applied it conditionally in the Vulkan pipeline based on the shader's BLAKE3 hash. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12555 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36990>	2025-09-12 22:32:46 +00:00
Georg Lehmann	79d02047b8	intel: switch to new subgroup size info Reviewed-by: Iván Briano <ivan.briano@intel.com> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37258>	2025-09-12 21:05:17 +00:00
Georg Lehmann	95c2a65662	nir: remove unused shader_info param in nir_create_shader Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37258>	2025-09-12 21:05:17 +00:00
Caio Oliveira	c358842c1d	brw: Don't use individual rallocs for each instruction Move from a single ralloc allocation per instruction to contiguous blocks of allocations. Still use ralloc for those large blocks. Each ralloc allocation has at least 5 pointers of overhead, which would be about a third of the current brw_inst, and get worse as we try to pack brw_inst better. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36730>	2025-09-12 00:25:05 +00:00
Caio Oliveira	2506540566	brw: Repack brw_inst fields In Release build, goes from 72 to 64 bytes, and now fits in a single cacheline. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36730>	2025-09-12 00:25:05 +00:00
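
Commit 5c68b351fe above describes a selection loop that never picks a scheduling mode when every throughput estimate is zero and "best_perf" also starts at zero. A minimal sketch of that failure mode and of the -INFINITY fix follows. It is not Mesa code: the function name `pick_best_mode`, the list of estimates, and the strict "greater than" comparison are all assumptions made for illustration.

```python
import math


def pick_best_mode(throughputs, initial_best):
    """Return the index of the highest-throughput candidate, or None.

    Hypothetical helper for illustration only; it is not Mesa code.
    """
    best_index = None
    best_perf = initial_best
    for index, perf in enumerate(throughputs):
        # Assumed strict comparison: a candidate must beat the current best.
        if perf > best_perf:
            best_perf = perf
            best_index = index
    return best_index


if __name__ == "__main__":
    # Estimates that all collapsed to zero, as the commit message describes.
    collapsed = [0.0, 0.0, 0.0]
    print(pick_best_mode(collapsed, 0.0))        # no candidate is selected
    print(pick_best_mode(collapsed, -math.inf))  # the first candidate is selected
```

Starting from negative infinity means any finite estimate, including zero, compares greater than the initial value, so the loop always ends with some mode selected.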


4616 commits