fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-24 13:10:10 +01:00

Author	SHA1	Message	Date
Emma Anholt	c00ebca5c4	ir3: Improve spilling of NIR vars to scratch. Previously, we would spill at the NIR level any temp array over 16 vec4s. This had two problems: 1) We wouldn't spill for the worst case scenario: a MAD accessing a dst array and 3 different src arrays (that all get fully unspilled, rather than just reloading the specific reg in the operand). This would fail to register allocate. We haven't seen this in practice. 2) We would spill vec4[17] and larger arrays that weren't necessary to get the shader to register allocate. This occurred on a FS for in Stray that had a vec4[24] array and just 4 vec4s of register pressure other than the array. Instead, use NIR scratch spilling when the worst case set of vars to reference in an instruction would overflow GPR space. This makes the shader in Stray go from 11ms to .5ms, by eliminating all spilling and leaving the array in GPRs. On the other hand, if leaving the arrays unspilled in NIR means that we cause spilling in ir3, the fact that ir3 spills/reloads work on the whole array may cause the amount of spilling to increase. 
However, we can see the effect is very small in terms of number of shaders affected in shader-db and an overwhelmingly positive effect on spills: MaxWaves: 22522470 -> 22520664 (-0.01%) Instrs: 396093281 -> 396122221 (+0.01%); split: -0.00%, +0.01% STPs: 218915 -> 182907 (-16.45%) LDPs: 155374 -> 153364 (-1.29%); split: -2.79%, +1.50% Totals from 496 (0.03% of 1561298) affected shaders: MaxWaves: 3792 -> 1986 (-47.63%) Instrs: 441224 -> 470164 (+6.56%); split: -0.00%, +6.57% CodeSize: 926164 -> 976734 (+5.46%); split: -0.05%, +5.52% NOPs: 58896 -> 52765 (-10.41%); split: -14.95%, +4.60% MOVs: 16314 -> 57901 (+254.92%) COVs: 3293 -> 5146 (+56.27%) Full: 12876 -> 23632 (+83.54%) (ss): 18613 -> 11573 (-37.82%); split: -47.53%, +9.71% (sy): 2539 -> 2505 (-1.34%); split: -10.75%, +9.41% (ss)-stall: 40682 -> 26413 (-35.07%); split: -47.90%, +12.80% (sy)-stall: 147862 -> 117004 (-20.87%); split: -37.65%, +16.69% STPs: 38566 -> 2558 (-93.37%) LDPs: 5060 -> 3050 (-39.72%); split: -85.77%, +45.93% Cat0: 65593 -> 59487 (-9.31%); split: -13.42%, +4.15% Cat1: 19667 -> 63105 (+220.87%) Cat2: 155958 -> 157879 (+1.23%); split: -0.05%, +1.28% Cat6: 105228 -> 94910 (-9.81%); split: -12.36%, +2.54% Cat7: 2480 -> 2485 (+0.20%); split: -0.08%, +0.28% Subgroup size: 31872 -> 31744 (-0.40%) The primary impacted application from shader-db is gfxbench aztec ruins. A quick test of it showed no significant performance improvement (n=3). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37245>	2025-12-17 19:50:28 +00:00
Emma Anholt	0d9428736b	ir3/ra: Make a helper to get RA register pressure limits. I'll be reusing this to let vars_to_scratch keep bigger arrays in register space. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37245>	2025-12-17 19:50:28 +00:00
Emma Anholt	d5cb38e457	ir3: Move the compute shader threadsize forcing earlier. With this, we can look at real_wavesize while running NIR passes and know if we have to be doubled because of the shader info coming in. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37245>	2025-12-17 19:50:28 +00:00
Emma Anholt	5a09abe890	nir: Introduce nir_lower_vars_to_scratch_global(). This lets the driver make a more informed decision about which vars to lower to scratch based on the vars available to spill. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37245>	2025-12-17 19:50:28 +00:00
Emma Anholt	059d301c79	nir: Drop the mode argument of nir_lower_vars_to_scratch(). It only makes sense for function temps, and that's the only way it's been used. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37245>	2025-12-17 19:50:28 +00:00
Yiwei Zhang	962bed2dd6	vulkan: update ALLOWED_ANDROID_VERSION for api level 36 Reviewed-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38988>	2025-12-17 19:22:47 +00:00
Mel Henning	dfdaee5ca7	nak: Use the hardware's max warps_per_sm value This should improve our occupancy estimates. Reviewed-by: Mary Guillemard <mary@mary.zone> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38913>	2025-12-17 19:08:05 +00:00
Mel Henning	b154071178	nak: Don't box ShaderModelInfo Reviewed-by: Mary Guillemard <mary@mary.zone> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38913>	2025-12-17 19:08:05 +00:00
Mel Henning	d7e906d60e	nak: Replace &dyn ShaderModel w/ &ShaderModelInfo This is mostly a s/dyn ShaderModel/ShaderModelInfo/ with a few manual fixes. With this change, we now statically dispatch into ShaderModel, which is a bit faster than dynamically dispatching. Together, this commit and the last one improve compile times by about 1% geomean. Reviewed-by: Mary Guillemard <mary@mary.zone> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38913>	2025-12-17 19:08:04 +00:00
Mel Henning	ee65578fa1	nak: Add ShaderModelInfo which statically dispatches into the right ShaderModel implementation. Reviewed-by: Mary Guillemard <mary@mary.zone> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38913>	2025-12-17 19:08:04 +00:00
Ian Romanick	66fd4d72fd	nir/algebraic: Mask with shifted constant instead of shift-then-mask shader-db: All Intel platforms had similar results. (Lunar Lake shown) total instructions in shared programs: 17088766 -> 17088765 (<.01%) instructions in affected programs: 1375 -> 1374 (-0.07%) helped: 1 / HURT: 1 total cycles in shared programs: 887873068 -> 887871748 (<.01%) cycles in affected programs: 136402 -> 135082 (-0.97%) helped: 2 / HURT: 0 fossil-db: Lunar Lake Totals: Instrs: 924954240 -> 924939317 (-0.00%); split: -0.00%, +0.00% Subgroup size: 40937696 -> 40937728 (+0.00%) Cycle count: 106116946509 -> 106116637903 (-0.00%); split: -0.00%, +0.00% Spill count: 3423930 -> 3423250 (-0.02%); split: -0.02%, +0.00% Fill count: 4876960 -> 4876045 (-0.02%); split: -0.03%, +0.01% Max live registers: 193882457 -> 193881816 (-0.00%); split: -0.00%, +0.00% Max dispatch width: 49078640 -> 49078656 (+0.00%) Non SSA regs after NIR: 231314214 -> 231314219 (+0.00%); split: -0.00%, +0.00% Totals from 13809 (0.68% of 2019450) affected shaders: Instrs: 25433084 -> 25418161 (-0.06%); split: -0.08%, +0.02% Subgroup size: 32 -> 64 (+100.00%) Cycle count: 1483550606 -> 1483242000 (-0.02%); split: -0.27%, +0.25% Spill count: 41466 -> 40786 (-1.64%); split: -1.88%, +0.24% Fill count: 74195 -> 73280 (-1.23%); split: -2.12%, +0.88% Max live registers: 2326365 -> 2325724 (-0.03%); split: -0.05%, +0.02% Max dispatch width: 234848 -> 234864 (+0.01%) Non SSA regs after NIR: 3394104 -> 3394109 (+0.00%); split: -0.00%, +0.00% Meteor Lake and DG2 had similar results. 
(Meteor Lake shown) Totals: Instrs: 997527742 -> 997524495 (-0.00%); split: -0.00%, +0.00% Subgroup size: 27452928 -> 27452944 (+0.00%) Cycle count: 93646717070 -> 93649738060 (+0.00%); split: -0.00%, +0.01% Spill count: 3710125 -> 3709784 (-0.01%); split: -0.03%, +0.02% Fill count: 5032819 -> 5033191 (+0.01%); split: -0.04%, +0.05% Max live registers: 121648838 -> 121648528 (-0.00%); split: -0.00%, +0.00% Max dispatch width: 37811544 -> 37811584 (+0.00%) Non SSA regs after NIR: 255562054 -> 255565914 (+0.00%); split: -0.00%, +0.00% Totals from 14438 (0.63% of 2281134) affected shaders: Instrs: 25974222 -> 25970975 (-0.01%); split: -0.08%, +0.06% Subgroup size: 16 -> 32 (+100.00%) Cycle count: 1149710820 -> 1152731810 (+0.26%); split: -0.29%, +0.55% Spill count: 44445 -> 44104 (-0.77%); split: -2.23%, +1.46% Fill count: 76172 -> 76544 (+0.49%); split: -2.89%, +3.37% Max live registers: 1237997 -> 1237687 (-0.03%); split: -0.04%, +0.02% Max dispatch width: 123528 -> 123568 (+0.03%) Non SSA regs after NIR: 3490757 -> 3494617 (+0.11%); split: -0.03%, +0.14% Tiger Lake, Ice Lake, and Skylake had similar results. 
(Tiger Lake shown) Totals: Instrs: 1013364485 -> 1013342384 (-0.00%); split: -0.00%, +0.00% Cycle count: 85509342602 -> 85500105656 (-0.01%); split: -0.02%, +0.01% Spill count: 3903944 -> 3903350 (-0.02%); split: -0.02%, +0.01% Fill count: 6801948 -> 6799368 (-0.04%); split: -0.05%, +0.01% Max live registers: 122212165 -> 122211859 (-0.00%); split: -0.00%, +0.00% Max dispatch width: 37805336 -> 37805472 (+0.00%) Non SSA regs after NIR: 244624956 -> 244628603 (+0.00%); split: -0.00%, +0.00% Totals from 14835 (0.65% of 2278397) affected shaders: Instrs: 27522570 -> 27500469 (-0.08%); split: -0.10%, +0.02% Cycle count: 1128820972 -> 1119584026 (-0.82%); split: -1.53%, +0.71% Spill count: 46408 -> 45814 (-1.28%); split: -2.04%, +0.76% Fill count: 99071 -> 96491 (-2.60%); split: -3.14%, +0.54% Max live registers: 1287967 -> 1287661 (-0.02%); split: -0.04%, +0.02% Max dispatch width: 126600 -> 126736 (+0.11%) Non SSA regs after NIR: 3438628 -> 3442275 (+0.11%); split: -0.03%, +0.14% Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38979>	2025-12-17 18:38:55 +00:00
Tapani Pälli	2418c91537	anv/drirc: disable Xe2 CCS drm modifiers for GTK engine Cc: mesa-stable Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38373>	2025-12-17 17:34:09 +00:00
Connor Abbott	68c1a8230d	freedreno/crashdec: Fix crash with older kernels Older kernels lack the cluster-name property. Don't crash decoding devcoredumps from them, even if they can't be converted to snapshots properly. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38931>	2025-12-17 16:00:56 +00:00
Samuel Pitoiset	f8feed17e1	ac,radv,radeonsi: add tracked register macros to common code Because the tracked registers are really driver-dependent, the driver is expected to handle the tracked_registers struct itself. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740>	2025-12-17 15:09:26 +00:00
Samuel Pitoiset	c580fc667f	ac,radv: add ac_cmdbuf::context_roll and use it Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740>	2025-12-17 15:09:26 +00:00
Samuel Pitoiset	f3b385859a	ac,radv: add more cmdbuf emit helpers Some can't be shared with RadeonSI because it uses templates in some places. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740>	2025-12-17 15:09:25 +00:00
Samuel Pitoiset	b444dc145a	radv: remove redundant assertions in radeon_emit_{array}() The common helpers already have assertions. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740>	2025-12-17 15:09:25 +00:00
Samuel Pitoiset	262fc80e45	ac,radv,radeonsi: add functions to initialize tracked regs Also initialize the new slots for RADV. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740>	2025-12-17 15:09:25 +00:00
Samuel Pitoiset	eb2f4a13c4	radeonsi: remove dead code in si_set_tracked_regs_to_clear_state() GFX12 doesn't have clear state. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740>	2025-12-17 15:09:24 +00:00
Samuel Pitoiset	44314e1ea6	ac,radv,radeonsi: add ac_tracked_regs Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740>	2025-12-17 15:09:24 +00:00
Samuel Pitoiset	c97bd17d4d	radv: switch to AC_TRACKED_xxx Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740>	2025-12-17 15:09:23 +00:00
Samuel Pitoiset	fad24d6fcc	ac/cmdbuf: add new slots to ac_tracked_reg For RADV registers that aren't tracked in RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740>	2025-12-17 15:09:23 +00:00
Samuel Pitoiset	18bdb76408	ac,radeonsi: move si_tracked_reg to common code Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38740>	2025-12-17 15:09:22 +00:00
Icenowy Zheng	6bda88bfdb	pvr: copy WSI can_present_on_device function from PanVK Both PVR and PanVK are drivers for generic embedded GPU IP cores, so just take the can_present_on_device implementation from PanVK, which allows any platform devices for presentation. Signed-off-by: Icenowy Zheng <uwu@icenowy.me> Reviewed-by: Frank Binns <frank.binns@imgtec.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38985>	2025-12-17 14:53:39 +00:00
Martin Roukala (né Peres)	8b8e472c65	zink/ci: update the a750 expectations Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38977>	2025-12-17 14:10:32 +00:00
Martin Roukala (né Peres)	5f54ae9048	turnip/ci: update the vkd3d expectations Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38977>	2025-12-17 14:10:32 +00:00
Martin Roukala (né Peres)	f155711a33	freedreno/ci: update the a750 expectations Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38977>	2025-12-17 14:10:32 +00:00
Martin Roukala (né Peres)	6993b0172b	freedreno/ci/a750: switch to the linux-firmware-provided gpu fw Now that qcom has released the gpu firmware for the a750, let's stop using my fw package in favor of the publicly-available ones. v2: * Be more specific in the list of files we want to keep (lumag) * Uprev the linux firmware version * Use gfx-ci/firmware rather than the upstream gitlab repo Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38977>	2025-12-17 14:10:32 +00:00
Erik Faye-Lund	74b7b68628	mesa/st: always override internal-format for 10-bit formats We also need to do this in the GLES-only code-path, otherwise we'll end up setting PIPE_BIND_RENDER_TARGET for these, which means we'll incorrectly require these to be color-renderable. Fixes: `60e115dedf` ("mesa/st: do not drop binding prematurely") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38945>	2025-12-17 13:42:21 +00:00
Caterina Shablia	0da350f879	panvk: remove AFBC header zeroing This is not actually necessary and moreover was corrupting mipmapped arrayed 2D images in cases when the transition barrier wasn't transitioning all mips, but more than one layer. Keep the layout transition infrastructure in place as we'll need it for transaction elimination CRC zeroing on v10-. Fixes: `c95f8993` ("panvk: add a meta command for transitioning image layout") Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38972>	2025-12-17 12:33:58 +00:00
Caterina Shablia	d8ceb38ef1	panvk: do not access the image in image view's destructor Vulkan allows destroying an image without destroying the views of this image first. These views can not be used in any way and the only thing that the user can do with such a view is destroy it. This also means that the driver can not refer to the image inside the image view's destructor. Fixes `cb3f6481` ("panvk: Create MS shadow images and views") Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38972>	2025-12-17 12:33:58 +00:00
Samuel Pitoiset	bf2aa05b60	zink/ci: add two tests to the skip lists They either fail or hang. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38714>	2025-12-17 11:11:18 +00:00
Samuel Pitoiset	5d76202b6d	radv: create descriptors for color/depth-stencil surfaces earlier For less CPU overhead when rendering begins and also because it's easy to pre-compute those descriptors. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38714>	2025-12-17 11:11:18 +00:00
Samuel Pitoiset	c8729cdd3c	radv/meta: stop passing a stencil attachment for depth decompress It should only be the depth aspect. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38714>	2025-12-17 11:11:18 +00:00
Samuel Pitoiset	43d7d97b13	radv/meta: inject image view usage info This will be used to initialize color/depth-stencil descriptors earlier when the image view is created. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38714>	2025-12-17 11:11:18 +00:00
Samuel Pitoiset	ce69cabb60	radv: constify radv_{cb,ds}_buffer_info parameters Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38714>	2025-12-17 11:11:18 +00:00
Lucas Fryzek	48799005d7	Revert "drisw: Copy entire buffer ignoring damage regions" This reverts commit `755e795e4c`. Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38817>	2025-12-17 10:06:32 +00:00
Lucas Fryzek	17ab0f2ece	drisw: Modify drisw_swap_buffers_with_damage to swap entire buffer When swapping buffer with damage regions, to be strictly correct we need to swap the entire back buffer to the front buffer. This needs to be done in case the compositor does not support damage regions. This means we need to ignore the input damage region and tell drisw to swap the entire buffer. Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38817>	2025-12-17 10:06:32 +00:00
Georg Lehmann	37c3a2fb89	zink/ci: update radv trace checksums Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38730>	2025-12-17 08:41:32 +00:00
Georg Lehmann	0478021fdc	aco/optimizer: reassociate rcp(mul(a, const)) into rcp_omod(a) Foz-DB Navi48: Totals from 2484 (2.54% of 97637) affected shaders: Instrs: 10368279 -> 10361892 (-0.06%); split: -0.06%, +0.00% CodeSize: 55161104 -> 55150752 (-0.02%); split: -0.02%, +0.00% SpillSGPRs: 14665 -> 14666 (+0.01%) Latency: 87694014 -> 87689324 (-0.01%); split: -0.01%, +0.00% InvThroughput: 16595764 -> 16594448 (-0.01%); split: -0.01%, +0.00% VClause: 209922 -> 209918 (-0.00%); split: -0.01%, +0.00% SClause: 205195 -> 205251 (+0.03%); split: -0.01%, +0.04% Copies: 843771 -> 843765 (-0.00%); split: -0.01%, +0.01% Branches: 275985 -> 275962 (-0.01%); split: -0.01%, +0.00% PreVGPRs: 170608 -> 170494 (-0.07%) VALU: 5840893 -> 5838038 (-0.05%); split: -0.05%, +0.00% SALU: 1481388 -> 1479037 (-0.16%); split: -0.16%, +0.00% VOPD: 7496 -> 7485 (-0.15%) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38730>	2025-12-17 08:41:32 +00:00
Georg Lehmann	a8f5ced670	aco/optimizer: reassociate mul(mul(a, const), b) into mul_omod(a, b) Foz-DB Navi48: Totals from 14608 (14.96% of 97637) affected shaders: MaxWaves: 364201 -> 364421 (+0.06%) Instrs: 28051720 -> 28022503 (-0.10%); split: -0.13%, +0.03% CodeSize: 148938740 -> 148943480 (+0.00%); split: -0.04%, +0.04% VGPRs: 994520 -> 994004 (-0.05%); split: -0.05%, +0.00% SpillSGPRs: 45182 -> 45179 (-0.01%) Latency: 187734461 -> 187725301 (-0.00%); split: -0.07%, +0.06% InvThroughput: 33967002 -> 33949881 (-0.05%); split: -0.11%, +0.06% VClause: 495237 -> 495207 (-0.01%); split: -0.03%, +0.02% Copies: 2048324 -> 2047937 (-0.02%); split: -0.12%, +0.10% Branches: 598445 -> 598431 (-0.00%); split: -0.01%, +0.01% PreSGPRs: 877715 -> 877684 (-0.00%) PreVGPRs: 778146 -> 776383 (-0.23%); split: -0.23%, +0.00% VALU: 16413380 -> 16391508 (-0.13%); split: -0.15%, +0.01% SALU: 3685279 -> 3677655 (-0.21%); split: -0.23%, +0.02% VOPD: 26219 -> 25926 (-1.12%); split: +0.43%, -1.55% Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38730>	2025-12-17 08:41:31 +00:00
Daniel Schürmann	125ac1626d	radv: remove precomputed registers from radv_shader_binary It is enough to compute them after upload. This saves some disk space and eliminates an unlikely bug where the shader cache is shared between two GPUs with the same chip but a different number of enabled CUs. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38970>	2025-12-17 08:16:06 +00:00
Sagar Ghuge	61287b00f3	anv: Stop using RCS companion for MSAA copy/clear on Xe3+ On Xe3+, we have typed MSAA load/store message support. We can use them during MSAA copies. We don't have to fallback on RCS companion queue anymore. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33905>	2025-12-17 05:34:02 +00:00
Sagar Ghuge	de0c547448	blorp: Handle 2D MSAA array image copies on compute shader We are passing number of layers as inline parameter register, so figure out z_pos and write to 2D MSAA array images in compute shader. We already get component X, Y and sample index, all we needed was the number of layers. Ken: - Use load/store var instead of derefs Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33905>	2025-12-17 05:34:02 +00:00
Sagar Ghuge	080d28a03e	blorp: Set persample_msaa_dispatch for render shader Only 3D shader gets dispatched per sample not the compute shader. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33905>	2025-12-17 05:34:02 +00:00
Lionel Landwerlin	d99a3d9b58	anv: remove CS-L3 coherency on Xe2 I'll try to write some crucible tests for this. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `be5f5f659f` ("anv: consider CS coherent with L3 on Xe2+") Fixes: `503355c7f8` ("anv: update pipeline barriers for Xe2+") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38966>	2025-12-16 21:35:27 +00:00
Steev Klimaszewski	10f259e673	tu: Stop printing descriptor pool allocation failures The VK_ERROR_FRAGMENTED_POOL and VK_ERROR_OUT_OF_POOL_MEMORY errors are not as exceptional cases as most. These are expected to be hit by applications in the normal course of doing their thing. Probably best not to spam stderr and the debug logs with them. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38940>	2025-12-16 21:11:41 +00:00
Rob Clark	a520752328	freedreno/a6xx: gen8 lrz support Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38930>	2025-12-16 19:38:38 +00:00
Rob Clark	0e82a8d759	freedreno/a6xx: Fix layered lrz Don't hard-code to a single layer, and fix lrz (slow) clear path to account for the # of layers. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5582 Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38930>	2025-12-16 19:38:37 +00:00
Rob Clark	14a23e8b3e	freedreno/lrz: Add gen8 lrz layout support Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38930>	2025-12-16 19:38:37 +00:00

216211 commits