fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-24 02:20:11 +01:00

Author	SHA1	Message	Date
Daniel Schürmann	acb47d2c78	nir/load_store_vectorize: also parse offsets through u2u64 if additions don't wrap around Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37163>	2025-09-08 09:56:03 +00:00
Sergi Blanch-Torne	084add9959	ci: disable Collabora's farm due to maintenance Planned downtime in the farm: * Start: 2025-09-08 07:00 UTC * End: 2025-09-08 13:00 UTC Signed-off-by: Sergi Blanch-Torne <sergi.blanch.torne@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36752>	2025-09-08 07:03:33 +02:00
Timothy Arceri	891d46f517	st/glsl_to_nir: dont add duplicate state tokens This avoids adding duplicates that also won't be optimised as the param optimise path is also skipped for variants. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13735 Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37037>	2025-09-07 23:13:25 +00:00
Timothy Arceri	5231bbe32e	st/glsl: set driver loc after lowering clipplane We need to store the driver location when we add it to the state param list. Currently this code only works because st_nir_assign_uniform_locations() later adds duplicate params but this will be fixed in a following patch. Unfortunately we could not simply move nir_lower_clip to the state tracker like we did in the rest of this MR as it is also called directly from some drivers. Also to avoid making nir depend on the gl_parameter definitions we simply loop over the results of the lowering and fix up the locations. Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37037>	2025-09-07 23:13:25 +00:00
Timothy Arceri	265b878f80	st/glsl: set driver location in nir_lower_point_size_mov() Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37037>	2025-09-07 23:13:25 +00:00
Timothy Arceri	8b1d48cf0b	nir: move nir_lower_point_size_mov() to st Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37037>	2025-09-07 23:13:24 +00:00
Timothy Arceri	a73dab0af8	st/glsl: set driver location in nir_lower_alpha_test() This previously worked because the driver locations would later be set when st_nir_assign_uniform_locations() was called for a second time but we will be skipping the extra call in a later patch. Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37037>	2025-09-07 23:13:24 +00:00
Timothy Arceri	450419c3f4	nir: move nir_lower_alpha_test() to the st This is gl specific and a following fix will add more gl specific params so here we move it to the st to avoid filling nir.h with more junk. Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37037>	2025-09-07 23:13:23 +00:00
Timothy Arceri	3d9a5ee95d	st/glsl: set driver locations in nir_lower_drawpixels() This previously worked because the driver locations would later be set when st_nir_assign_uniform_locations() was called for a second time but we will be skipping the extra call in a later patch. Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37037>	2025-09-07 23:13:23 +00:00
Timothy Arceri	8417f4a8eb	nir: move nir_lower_drawpixels() to the state tracker This is gl specific and a following fix will add more gl specific params so here we move it to the st to avoid filling nir.h with more junk. Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37037>	2025-09-07 23:13:22 +00:00
Timothy Arceri	109040a4b5	st/glsl: fix nir_lower_position_invariant() We need to store the driver location when we add it to the state param list. Currently this code only works because st_nir_assign_uniform_locations() later adds duplicate params but this will be fixed in a following patch. Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37037>	2025-09-07 23:13:22 +00:00
Timothy Arceri	79f0060618	st/glsl: fix packed uniform handling in st_nir_lower_fog() This previously worked because the driver locations would later be overwritten when st_nir_assign_uniform_locations() was called for a second time but we will be skipping the extra call in a later patch. Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37037>	2025-09-07 23:13:21 +00:00
Timothy Arceri	3da857d7c5	st/glsl: encapsulate more in st_nir_state_variable_create() This will allow us to fix bugs with driver_location in following patches while keeping code to a minimum. For now we always set uniform packing to false, this will be corrected as we go in following patches for now it doesn't matter as the driver location is later overwritten. Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37037>	2025-09-07 23:13:21 +00:00
Eric Engestrom	415db01738	venus/ci: document fixed tests Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37217>	2025-09-07 22:25:59 +02:00
Eric Engestrom	7d56f83875	zink+turnip/ci: document fixed tests Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37217>	2025-09-07 22:25:59 +02:00
Eric Engestrom	0cfc3429fc	zink+nvk/ci: document fixed tests Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37217>	2025-09-07 22:25:58 +02:00
Eric Engestrom	a5fd6fce4c	nvk/ci: document fixed tests Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37217>	2025-09-07 22:25:58 +02:00
Eric Engestrom	e0adaae78a	r300/ci: document fixed tests Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37217>	2025-09-07 22:24:31 +02:00
Eric Engestrom	ff791ab7a9	etnaviv/ci: document fixed tests Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37217>	2025-09-07 22:24:22 +02:00
Christoph Neuhauser	2f8b8649f0	iris: Increase max_shader_buffer_size to max_buffer_size This commit increases max_shader_buffer_size to max_buffer_size for Iris. Signed-off-by: Christoph Neuhauser <christoph.neuhauser@intel.com> Co-authored-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37182>	2025-09-07 16:17:10 +00:00
Caio Oliveira	62815cc91f	util: Avoid invalid access in ralloc_print_info() Check if allocation is large enough to hold the linear and gc contexts before probing for them. Fixes: `7b5b164281` ("util: Add function print information about a ralloc tree") Acked-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37017>	2025-09-06 20:28:34 +00:00
Caio Oliveira	f37c9c873c	brw: Fix printing of blocks in disassembly when BRW is available When disassembling and BRW IR is available (which happens in the generator), there will be pointers to the BRW's basic block structures that are used to print the block numbers and predecessor/successors in the output. There are two challenges: - Because DO and FLOW instructions are not real instructions, they are not emitted in the output but would still cause the output to contain empty blocks. Previous code accounted for DO but still had problems. - DO blocks have special physical links that don't make sense when the DO is not emitted at the end, but they would be shown even if that block was omitted. These issues can be seen here (edited to remove non-essential bits) ``` START B0 (2 cycles) mov(8) g126<1>UD 0x3f800000UD END B0 ->B1 START B2 <-B1 <-B4 (0 cycles) END B2 ->B3 START B3 <-B2 (260 cycles) LABEL1: mov(8) g1<1>D 0D cmp.ge.f0.0(8) null<1>D g2<0,1,0>D 10D sync nop(1) null<0,1,0>UB send(1) g0UD g1UD nullUD (+f0.0) break(8) JIP: LABEL0 UIP: LABEL0 END B3 ->B1 ->B5 ->B4 START B4 <-B3 (1000 cycles) sync nop(1) null<0,1,0>UB mov(8) g126<1>UD g0<0,1,0>UD LABEL0: while(8) JIP: LABEL1 END B4 ->B2 START B5 <-B1 <-B3 (20 cycles) ``` For example: - Block 1 is missing (a skipped DO block) - Block 2 is empty (it was a FLOW block) - Block 3 ends with a link to Block 1 (the special links involving DO blocks). Two key changes were made to fix this. First, skip the DO and FLOW blocks completely. The use_tail ensures that the instruction group is reused to avoid empty blocks. Second, when printing, the successors and predecessors, walk through the skipped blocks. And finally, don't print the special blocks. With the fix, here's the output. Note the blocks retain their original BRW IR number.
``` START B0 (2 cycles) mov(8) g127<1>UD 0x3f800000UD END B0 ->B3 START B3 <-B0 <-B4 (260 cycles) LABEL1: mov(8) g1<1>D 0D cmp.ge.f0.0(8) null<1>D g2<0,1,0>D 10D sync nop(1) null<0,1,0>UB send(1) g0UD g1UD nullUD (+f0.0) break(8) JIP: LABEL0 UIP: LABEL0 END B3 ->B5 ->B4 START B4 <-B3 (1000 cycles) sync nop(1) null<0,1,0>UB mov(8) g127<1>UD g0<0,1,0>UD LABEL0: while(8) JIP: LABEL1 END B4 ->B3 START B5 <-B3 (20 cycles) ``` Issue was spotted by Ken. Fixes: `d2c39b1779` ("intel/brw: Always have a (non-DO) block after a DO in the CFG") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36226>	2025-09-06 16:42:05 +00:00
Daniel Schürmann	c78f1d516c	nir/algebraic: add pattern for (a << #b) * #c => a * (#c << #b) Totals from 2545 (3.19% of 79839) affected shaders: (Navi48) Instrs: 6371003 -> 6364130 (-0.11%); split: -0.12%, +0.01% CodeSize: 33827548 -> 33812244 (-0.05%); split: -0.06%, +0.01% Latency: 47451755 -> 47430108 (-0.05%); split: -0.05%, +0.00% InvThroughput: 10442450 -> 10437159 (-0.05%); split: -0.05%, +0.00% SClause: 159829 -> 159874 (+0.03%); split: -0.01%, +0.04% Copies: 500725 -> 500721 (-0.00%); split: -0.01%, +0.01% PreSGPRs: 110482 -> 110478 (-0.00%); split: -0.00%, +0.00% PreVGPRs: 147289 -> 147287 (-0.00%); split: -0.00%, +0.00% VALU: 3456135 -> 3454241 (-0.05%); split: -0.06%, +0.01% SALU: 925982 -> 923616 (-0.26%) VOPD: 1243 -> 1212 (-2.49%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37173>	2025-09-06 10:18:42 +00:00
Georg Lehmann	87f451aa39	intel/ci: update restricted trace checksums Caused by https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37113 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37211>	2025-09-06 11:59:16 +02:00
Georg Lehmann	f47e4fee4c	mesa: clamp fog scale to -FLT_MAX instead of FLT_MIN FLT_MIN is the smallest positive float, not the smallest negative float. Fixes: `35ae5dce39` ("mesa: don't pass Infs to the shader via gl_Fog.scale") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11412 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37204>	2025-09-06 07:20:31 +00:00
Yonggang Luo	885323ea3a	tgsi/nir: Handling TGSI_OPCODE_RET in tgsi_to_nir Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11554 The nir_push_if is needed as more instructions will be added after `RET`. Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37170>	2025-09-06 01:34:44 +00:00
Faith Ekstrand	c2a9a33f75	nvk: Use Vulkan formats for SET_ZT_FORMAT instead of NIL The Vulkan format is actually depth/stencil while the NIL format sometimes has the stencil swapped for X. Fixes: `89110b8d1d` ("nvk: Use the image format for depth views") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37208>	2025-09-06 00:26:48 +00:00
Emma Anholt	29fb897c0a	ir3: Enable nir_opt_shrink_shrink_vec_array_vars. The effect is surprisingly big, though it does seem to be concentrated in just a few apps (Batman: Arkham Origins, Metro 2033 Redux, Shadow Warrior): Totals: MaxWaves: 19680240 -> 19788620 (+0.55%); split: +0.55%, -0.00% Instrs: 369291159 -> 367831500 (-0.40%); split: -0.40%, +0.01% CodeSize: 936669580 -> 933798912 (-0.31%); split: -0.31%, +0.00% ... Totals from 16918 (1.21% of 1402199) affected shaders: MaxWaves: 125724 -> 234104 (+86.20%); split: +86.83%, -0.63% Instrs: 11328230 -> 9868571 (-12.89%); split: -13.13%, +0.25% CodeSize: 23684238 -> 20813570 (-12.12%); split: -12.24%, +0.12% NOPs: 1633346 -> 1640119 (+0.41%); split: -2.09%, +2.50% MOVs: 1940036 -> 510016 (-73.71%); split: -75.07%, +1.36% COVs: 188107 -> 188546 (+0.23%); split: -0.32%, +0.56% Full: 454239 -> 263078 (-42.08%); split: -42.80%, +0.71% (ss): 251004 -> 231443 (-7.79%); split: -9.81%, +2.01% (sy): 116086 -> 115153 (-0.80%); split: -2.38%, +1.58% (ss)-stall: 738920 -> 794215 (+7.48%); split: -7.13%, +14.62% (sy)-stall: 3321071 -> 3193717 (-3.83%); split: -5.58%, +1.74% STPs: 101880 -> 71523 (-29.80%) LDPs: 17406 -> 14411 (-17.21%) Preamble Instrs: 2519390 -> 2548205 (+1.14%); split: -0.31%, +1.46% Subgroup size: 1097472 -> 1097920 (+0.04%) Cat0: 1833041 -> 1839613 (+0.36%); split: -1.91%, +2.27% Cat1: 2128393 -> 698894 (-67.16%); split: -68.42%, +1.26% Cat2: 3602449 -> 3595086 (-0.20%); split: -0.24%, +0.03% Cat3: 2817384 -> 2815410 (-0.07%); split: -0.08%, +0.01% Cat4: 273682 -> 273655 (-0.01%) Cat5: 304630 -> 304398 (-0.08%) Cat6: 207434 -> 179648 (-13.40%); split: -13.70%, +0.31% Cat7: 161217 -> 161867 (+0.40%); split: -1.25%, +1.65% Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37191>	2025-09-06 00:03:12 +00:00
Emma Anholt	b353f868dc	ir3: Enable nir_opt_shrink_stores. This pass strips trailing components not in the writemask of store intrinsics, or from the trailing components that aren't part of an image's format. Totals from 11641 (0.83% of 1402199) affected shaders: MaxWaves: 159402 -> 159422 (+0.01%); split: +0.08%, -0.07% Instrs: 3073536 -> 3064117 (-0.31%); split: -0.59%, +0.28% CodeSize: 7529906 -> 7417398 (-1.49%); split: -1.54%, +0.04% NOPs: 286665 -> 289623 (+1.03%); split: -2.71%, +3.74% MOVs: 85466 -> 74849 (-12.42%); split: -14.28%, +1.86% Full: 116869 -> 116557 (-0.27%); split: -0.35%, +0.09% (ss): 68245 -> 65758 (-3.64%); split: -5.23%, +1.59% (sy): 31673 -> 31812 (+0.44%); split: -0.75%, +1.19% (ss)-stall: 160473 -> 161653 (+0.74%); split: -3.63%, +4.37% (sy)-stall: 668624 -> 668566 (-0.01%); split: -2.82%, +2.81% Preamble Instrs: 1059243 -> 1033109 (-2.47%); split: -2.47%, +0.00% Early Preamble: 10550 -> 10530 (-0.19%) Subgroup size: 1172672 -> 1172416 (-0.02%); split: +0.01%, -0.03% Cat0: 323161 -> 326364 (+0.99%); split: -2.50%, +3.49% Cat1: 156177 -> 145280 (-6.98%); split: -7.92%, +0.95% Cat2: 1448974 -> 1448964 (-0.00%) Cat3: 874169 -> 874175 (+0.00%) Cat5: 75743 -> 75742 (-0.00%) Cat7: 38702 -> 36982 (-4.44%); split: -5.80%, +1.35% Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37191>	2025-09-06 00:03:12 +00:00
Faith Ekstrand	baeb070a94	nvk: Stop adding Vulkan image usage flags The sampled and color attachment bits don't actually affect image layout in any meaningful way. They just cause us to create extra descriptors in cases where we may not need them. However, now that meta always sets view usage, we always create the usages meta needs, even if the client doesn't request them. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36957>	2025-09-05 23:34:15 +00:00
Faith Ekstrand	446d5ef103	vulkan: Drop the driver_internal from vk_image_view_init/create() It always comes in through the create flags now. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36957>	2025-09-05 23:34:14 +00:00
Faith Ekstrand	d1ef8647ac	v3dv: Use VK_IMAGE_VIEW_CREATE_DRIVER_INTERNAL_BIT_MESA Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36957>	2025-09-05 23:34:13 +00:00
Faith Ekstrand	1897d5d9c9	radv: Use VK_IMAGE_VIEW_CREATE_DRIVER_INTERNAL_BIT_MESA This does mean having to set the flag everywhere, which is a bit annoying, but I don't think I missed any. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36957>	2025-09-05 23:34:12 +00:00
Faith Ekstrand	4eb098a6f1	nvk: Use VK_IMAGE_VIEW_CREATE_DRIVER_INTERNAL_BIT_MESA Instead of having our own nvk_image_view_init() which passes through a boolean, just set the create flag. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36957>	2025-09-05 23:34:12 +00:00
Faith Ekstrand	42abf00f2b	vulkan: Handle VK_IMAGE_VIEW_CREATE_DRIVER_INTERNAL_BIT_MESA automatically This moves the bit into vk_image.h and handles it automatically in vk_image_view_init() so drivers don't have to. This also means that Meta is now hitting the driver_internal path for all its images so we need to do the same format fixups there that we would normally do on the !driver_internal path. We don't want to do them unconditionally because v3dv and other drivers override depth/stencil color formats and we don't want to break that. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36957>	2025-09-05 23:34:11 +00:00
Faith Ekstrand	e7b0cbdf40	vulkan/meta: Always set VK_IMAGE_VIEW_CREATE_DRIVER_INTERNAL_BIT_MESA Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36957>	2025-09-05 23:34:11 +00:00
Faith Ekstrand	89110b8d1d	nvk: Use the image format for depth views Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36957>	2025-09-05 23:34:11 +00:00
Connor Abbott	7527ad001a	tu: Lower ViewIndex to 0 when multiview is disabled This is an optimization, but it also seems to be required because the HW sometimes fails to set ViewIndex to 0. This fixes flakes with dEQP-VK.renderpass2.fragment_density_map.*multiviewport where the VS for the main renderpass is reused for the copy renderpass afterwards and it copies ViewIndex to ViewportIndex expecting it to be 0 since multiview is disabled for the copy renderpass. Closes: #13534 Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37206>	2025-09-05 22:17:39 +00:00
Karol Herbst	5bb463bb48	nak/qmd: properly set target shared mem size The shared memory settings in the QMD affect occupancy as the hardware needs to manage the available shared memory across all workgroups. We should set the target to the amount of shared memory used by all the blocks that can run concurrently taking GPR usage and the local size into account. E.g. a shader using 88 gprs, 256 threads and a shared memory size of 18944 can have 2 blocks running concurrently, therefore on an Ampere we need to set the target to 64kB to properly utilize the hardware. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37135>	2025-09-05 20:01:02 +00:00
Karol Herbst	a0131b53ad	nvk: use hardware limits for maxComputeSharedMemorySize It doesn't change the reported values, but it will allow us to easily advertise real hardware limits in the future. Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37135>	2025-09-05 20:01:02 +00:00
Karol Herbst	1d5a1b11db	nak/qmd: base shared mem size allocation on hardware limits We can allocate more than 48k of shared memory, but the limits differ across hardware, so we need to take it all into account to create the shared memory splits the hardware can accept. This does change behavior on Turing, but the assumption is, that the hardware has simply rounded up. Might need performance testing on Turing to verify nothing regresses here. Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37135>	2025-09-05 20:01:02 +00:00
Karol Herbst	b09deba713	nouveau/winsys: add shared memory size tables It's a bit of a disaster, but each generation supports a different set of shared memory configurations. Knowing the maximum is important for compute shader performance, knowing all the legal sizes for QMD generation. Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37135>	2025-09-05 20:01:01 +00:00
Karol Herbst	3c9fa18069	nvk: prepare for higher shared memory sizes On hw we have up to 228k of available Shared memory so a 16 bit int isn't enough for that. Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37135>	2025-09-05 20:01:01 +00:00
Karol Herbst	083a3dc545	util: move typed_memcpy into macros.h Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37135>	2025-09-05 20:01:00 +00:00
Mel Henning	1c764357e8	nvk: Only copy 32-bits for cond render operand A Now that we're guaranteed the upper 32 bits are zero initialized, there's no reason we need to do a 64-bit write here. This is a 0.3% performance improvement on the Sascha Willems conditionalrender demo with all rendering disabled (638 fps -> 640 fps) Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37187>	2025-09-05 18:57:33 +00:00
Mel Henning	4d8e2f7768	nvk: Don't re-initialize cond rendering operand B We can initialize this just once from the CPU side instead of overwriting it each time using the copy engine. This is a 5% performance improvement on the Sascha Willems conditionalrender demo with all rendering disabled (607 fps -> 638 fps) Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37187>	2025-09-05 18:57:33 +00:00
Mel Henning	966a1b5380	nvk: Reuse the same cond render temp in a cmd_buf Within a single command buffer, we know that our operations will happen sequentially so we don't need to allocate a unique address per vkCmdBeginConditionalRenderingEXT - we can re-use the same address instead. Improves perf on the Sascha Willems conditionalrender demo with all rendering disabled by about 2% (595 fps -> 607 fps) Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37187>	2025-09-05 18:57:33 +00:00
Mel Henning	64b4e52755	nvk: Move cond rendering memory out of gart This is a 41% performance improvement on the Sascha Willems conditionalrender demo with all rendering disabled (422 fps -> 595 fps) Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37187>	2025-09-05 18:57:32 +00:00
Mel Henning	0b43a625f4	nvk: Remove gart from the name of cond_render_mem Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37187>	2025-09-05 18:57:32 +00:00
Connor Abbott	a89f897870	freedreno/ci: Add a750 sparse skips Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>	2025-09-05 16:58:09 +00:00
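
Note on f47e4fee4c (fog scale clamp): the commit message points out that FLT_MIN is the smallest positive normalized float, not the most negative one. The sketch below illustrates why that matters when clamping a value to the finite float range. Both helper names are hypothetical and this is not Mesa's actual code, only a minimal illustration of the distinction.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Hypothetical helper (not from Mesa): clamp x to the finite float range.
 * The lower bound is -FLT_MAX, the most negative finite float. */
static float clamp_to_finite(float x)
{
   assert(!isnan(x)); /* NaN is out of scope for this sketch */
   if (x > FLT_MAX)
      return FLT_MAX;
   if (x < -FLT_MAX)
      return -FLT_MAX;
   return x;
}

/* Hypothetical buggy variant: FLT_MIN is a tiny positive number, so using
 * it as the lower bound turns every negative input into a positive one. */
static float clamp_with_flt_min(float x)
{
   if (x > FLT_MAX)
      return FLT_MAX;
   if (x < FLT_MIN)
      return FLT_MIN;
   return x;
}
```

With these definitions, clamp_to_finite(-2.0f) leaves the value unchanged and clamp_to_finite(-INFINITY) yields -FLT_MAX, while clamp_with_flt_min(-2.0f) returns FLT_MIN and so loses both the sign and the magnitude.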

211636 commits
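
Note on c78f1d516c (nir/algebraic pattern): the rewrite (a << #b) * #c => a * (#c << #b) relies on fixed-width integer multiplication wrapping modulo 2^N, so shifting the variable operand and shifting the constant operand give the same product. Assuming #b and #c denote compile-time constants, the right-hand side lets the shift be folded into a single constant. The sketch below checks the identity for 32-bit unsigned values; the function names are illustrative only and are not taken from NIR.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: shift the variable at run time, then multiply. */
static uint32_t shl_then_mul(uint32_t a, unsigned b, uint32_t c)
{
   assert(b < 32); /* shifting a 32-bit value by 32 or more is undefined in C */
   return (a << b) * c;
}

/* Rewritten form: (c << b) involves only constants, so it can be
 * precomputed, leaving one multiply on the variable operand. */
static uint32_t mul_by_shifted_const(uint32_t a, unsigned b, uint32_t c)
{
   assert(b < 32);
   return a * (c << b);
}
```

Both forms compute a * c * 2^b modulo 2^32, so they agree even when intermediate results overflow, for example with a = 0xdeadbeef, b = 7, c = 0x12345.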