fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-07 09:18:04 +02:00

Author	SHA1	Message	Date
Mike Blumenkrantz	0c17eadac0	zink: drop dt checks for mutable format init these are no longer applicable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23514>	2023-06-15 05:02:37 +00:00
Mike Blumenkrantz	9e83723a21	zink: add srgb mutable for all resources by default this should enable compression on more intermediate fb attachments it also means that VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT can now be set on images where ZINK_BIND_MUTABLE is not set, so non-resource APIs need to check ZINK_BIND_MUTABLE Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23514>	2023-06-15 05:02:37 +00:00
Mike Blumenkrantz	1859f191c3	zink: wrap format mismatch checks for blit/surface no functional changes Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23514>	2023-06-15 05:02:37 +00:00
Mike Blumenkrantz	5511a08a1c	zink: remove redundant conditional in set_sampler_views it's redundant, but it checks a different flag so it consumes cycles Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23514>	2023-06-15 05:02:37 +00:00
Caio Oliveira	26f456203c	compiler/types: Use hash table pre-hashed functions for type caching Calculate the hash outside the critical region, then use that both for search and insertion. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23280>	2023-06-15 04:16:22 +00:00
Caio Oliveira	40ba00238b	compiler/types: Tidy up the asserts in get_*_instance functions Use the local variable in the assertions, move them out of the critical region. No behavior change. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23279>	2023-06-15 03:43:46 +00:00
Caio Oliveira	efbbdeffc0	compiler/types: Be consistent when naming array element/size The element type passed is different than the array type and it is not a "base type" in the glsl_type sense, so pick a name that reflects that. Also stick to a single name for the array_size. Just renames, no behavior change. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23279>	2023-06-15 03:43:46 +00:00
Jesse Natalie	83f741124b	nir_lower_returns: Mark assert-only var as ASSERTED Fixes: `5d238c0c` ("nir_lower_returns: Optimize phis before beginning the pass") Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23634>	2023-06-15 03:09:29 +00:00
Dave Airlie	13df91d7d7	radv/video: restrict the number of IBs on video related queues. The hardware gets given a session context from userspace in each submission, but if the session context changes the hardware wants a FENCE to be emitted to know it can give up the current session. If a test submits interleaved session ctx access and uses a single Vulkan submit the hardware crashes, unless each IB is submitted in a separate submission so the fence can be sent. In theory it could be possible to construct a single command buffer to trigger this so I do think the hardware should be smarter here. Should this be fixed in the kernel to always emit a fence between IBs? Fixes: dEQP-VK.video.decode.h264_interleaved Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23641>	2023-06-15 02:49:00 +00:00
LingMan	0535948535	rusticl: fix UB in CLProp machinery Viewing structs as a collection of u8 is not generally sound. Any padding bytes might be uninitialized and creating an integer from uninitialized memory constitutes producing an invalid value, which is instant UB. Since we only copy these bytes around, the fix is to simply work with MaybeUninit<u8>, which can handle uninitialized memory just fine, instead. See: https://doc.rust-lang.org/reference/behavior-considered-undefined.html Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23652>	2023-06-15 02:31:19 +00:00
LingMan	fdcb86168d	rusticl: drop cl_prop_for_type macro There's no reason to differentiate between primitive types and structs here. `cl_prop_for_struct` can handle primitive types just fine. Drop `cl_prop_for_type` and rename the existing `cl_prop_for_struct` to `cl_prop_for_type`. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23652>	2023-06-15 02:31:19 +00:00
LingMan	cf43a74c79	rusticl: drop CLProp implementation for String Route the data to the implementation for &str instead. It works just as well. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23652>	2023-06-15 02:31:19 +00:00
LingMan	f1461c5a77	rusticl: core: stop using cl_prop from the api module It's a layering violation and really the wrong tool for the job. Add a new fn to view a given slice as a &[u8] instead of going through the clprop machinery which creates a new Vec. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23652>	2023-06-15 02:31:18 +00:00
Charmaine Lee	2755519142	svga: fix compute shader type after ntt Reset compute shader type after ntt. Fixes: `0ac9541804` ("gallium: Drop PIPE_SHADER_CAP_PREFERRED_IR") Reviewed-by: Neha Bhende <bhenden@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23659>	2023-06-15 02:11:38 +00:00
Karol Herbst	095fee55f8	rusticl: enforce using unsafe blocks in unsafe functions Signed-off-by: Karol Herbst <git@karolherbst.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23660>	2023-06-15 01:53:11 +00:00
Mike Blumenkrantz	4edbe8f5a0	zink: add mem debugging modeled off turnip's debug infra, this adds debug printing for oom scenarios Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23653>	2023-06-15 01:31:24 +00:00
Mike Blumenkrantz	65fad783c7	zink: break out vk flag unrolling into util function Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23653>	2023-06-15 01:31:24 +00:00
Ian Romanick	de60b463d7	nir/algebraic: Simplify various trivial bfi These are mostly just obvious patterns that somebody will eventually want to add. DG2, Tiger Lake, Ice Lake, Skylake, Broadwell, and Haswell had similar results (Ice Lake shown) total instructions in shared programs: 20570033 -> 20570026 (<.01%) instructions in affected programs: 7363 -> 7356 (-0.10%) helped: 6 / HURT: 0 total cycles in shared programs: 902118781 -> 902118854 (<.01%) cycles in affected programs: 419132 -> 419205 (0.02%) helped: 4 / HURT: 2 DG2, Tiger Lake, Ice Lake, and Skylake had similar results (Ice Lake shown) Totals: Instrs: 152819500 -> 152819380 (-0.00%) Cycles: 15014627187 -> 15014624437 (-0.00%) Totals from 115 (0.02% of 662497) affected shaders: Instrs: 28963 -> 28843 (-0.41%) Cycles: 404582 -> 401832 (-0.68%) Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19968>	2023-06-14 18:49:53 +00:00
Ian Romanick	541e7eb389	nir/algebraic: Optimize some u2f of bfi v2: Fix a copy-and-paste bug s/('find_lsb', a)/a/ in the patterns. See piglit!819. DG2, Tiger Lake, Ice Lake, Skylake, and Broadwell had similar results (Ice Lake shown) total instructions in shared programs: 20570063 -> 20570033 (<.01%) instructions in affected programs: 452 -> 422 (-6.64%) helped: 30 / HURT: 0 total cycles in shared programs: 902118723 -> 902118781 (<.01%) cycles in affected programs: 1762 -> 1820 (3.29%) helped: 0 / HURT: 29 DG2, Tiger Lake, Ice Lake, and Skylake had similar results (Ice Lake shown) Totals: Instrs: 152819969 -> 152819500 (-0.00%) Cycles: 15014628652 -> 15014627187 (-0.00%); split: -0.00%, +0.00% Totals from 469 (0.07% of 662497) affected shaders: Instrs: 7644 -> 7175 (-6.14%) Cycles: 31787 -> 30322 (-4.61%); split: -4.90%, +0.29% Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19968>	2023-06-14 18:49:53 +00:00
Ian Romanick	96cde9cc01	intel/fs: Emit better code for bfi(..., 0) DG2, Tiger Lake, Ice Lake, and Skylake had similar results (Ice Lake shown) total instructions in shared programs: 20570141 -> 20570063 (<.01%) instructions in affected programs: 30679 -> 30601 (-0.25%) helped: 77 / HURT: 0 total cycles in shared programs: 902113977 -> 902118723 (<.01%) cycles in affected programs: 3255958 -> 3260704 (0.15%) helped: 60 / HURT: 19 Broadwell total instructions in shared programs: 18524633 -> 18524547 (<.01%) instructions in affected programs: 34095 -> 34009 (-0.25%) helped: 75 / HURT: 2 total cycles in shared programs: 949532394 -> 949543761 (<.01%) cycles in affected programs: 3419107 -> 3430474 (0.33%) helped: 57 / HURT: 24 total spills in shared programs: 22484 -> 22484 (0.00%) spills in affected programs: 516 -> 516 (0.00%) helped: 2 / HURT: 2 total fills in shared programs: 29346 -> 29338 (-0.03%) fills in affected programs: 572 -> 564 (-1.40%) helped: 4 / HURT: 0 Haswell total instructions in shared programs: 17331356 -> 17331523 (<.01%) instructions in affected programs: 27920 -> 28087 (0.60%) helped: 41 / HURT: 4 total cycles in shared programs: 936603192 -> 936574664 (<.01%) cycles in affected programs: 3417695 -> 3389167 (-0.83%) helped: 28 / HURT: 21 total spills in shared programs: 19718 -> 19756 (0.19%) spills in affected programs: 436 -> 474 (8.72%) helped: 0 / HURT: 4 total fills in shared programs: 22547 -> 22607 (0.27%) fills in affected programs: 444 -> 504 (13.51%) helped: 0 / HURT: 4 Ivy Bridge total cycles in shared programs: 463451277 -> 463451273 (<.01%) cycles in affected programs: 95870 -> 95866 (<.01%) helped: 3 / HURT: 2 DG2, Tiger Lake, Ice Lake, and Skylake had similar results (Ice Lake shown) Totals: Instrs: 152825278 -> 152819969 (-0.00%); split: -0.00%, +0.00% Cycles: 15014075626 -> 15014628652 (+0.00%); split: -0.01%, +0.01% Subgroup size: 8528536 -> 8528560 (+0.00%) Send messages: 7711431 -> 7711464 (+0.00%) Spill count: 99907 -> 
99509 (-0.40%); split: -0.40%, +0.00% Fill count: 202459 -> 201598 (-0.43%); split: -0.43%, +0.00% Scratch Memory Size: 4376576 -> 4371456 (-0.12%) Totals from 2915 (0.44% of 662497) affected shaders: Instrs: 2288842 -> 2283533 (-0.23%); split: -0.24%, +0.01% Cycles: 471633295 -> 472186321 (+0.12%); split: -0.27%, +0.39% Subgroup size: 27488 -> 27512 (+0.09%) Send messages: 151344 -> 151377 (+0.02%) Spill count: 48091 -> 47693 (-0.83%); split: -0.83%, +0.00% Fill count: 59053 -> 58192 (-1.46%); split: -1.46%, +0.00% Scratch Memory Size: 1827840 -> 1822720 (-0.28%) Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19968>	2023-06-14 18:49:53 +00:00
Ian Romanick	6603948a7a	nir/algebraic: Lower some bfi with two constant sources All Haswell and newer Intel platforms had similar results. (Ice Lake shown) total instructions in shared programs: 19907054 -> 19906882 (<.01%) instructions in affected programs: 8103 -> 7931 (-2.12%) helped: 52 / HURT: 0 total cycles in shared programs: 855779334 -> 855781791 (<.01%) cycles in affected programs: 724201 -> 726658 (0.34%) helped: 38 / HURT: 7 total sends in shared programs: 1039308 -> 1039302 (<.01%) sends in affected programs: 162 -> 156 (-3.70%) helped: 2 / HURT: 0 No shader-db changes on any older Intel platforms. All Intel platforms had similar results. (Ice Lake shown) Totals: Instrs: 153117340 -> 152825222 (-0.19%); split: -0.19%, +0.00% Cycles: 15011904351 -> 15014072944 (+0.01%); split: -0.04%, +0.05% Send messages: 7711509 -> 7711421 (-0.00%) Spill count: 100745 -> 99907 (-0.83%); split: -0.85%, +0.02% Fill count: 203684 -> 202459 (-0.60%); split: -0.62%, +0.02% Scratch Memory Size: 4403200 -> 4376576 (-0.60%) Totals from 18603 (2.81% of 662496) affected shaders: Instrs: 5258303 -> 4966185 (-5.56%); split: -5.56%, +0.00% Cycles: 447391388 -> 449559981 (+0.48%); split: -1.29%, +1.77% Send messages: 559231 -> 559143 (-0.02%) Spill count: 5009 -> 4171 (-16.73%); split: -17.17%, +0.44% Fill count: 8769 -> 7544 (-13.97%); split: -14.33%, +0.36% Scratch Memory Size: 194560 -> 167936 (-13.68%) Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19968>	2023-06-14 18:49:53 +00:00
Ian Romanick	e419eefd34	intel/fs: Use nir_opt_reassociate_bfi All Skylake and newer Intel platforms had similar results. (Ice Lake shown) total instructions in shared programs: 19907072 -> 19907054 (<.01%) instructions in affected programs: 8859 -> 8841 (-0.20%) helped: 9 / HURT: 0 total cycles in shared programs: 855791238 -> 855779334 (<.01%) cycles in affected programs: 3308294 -> 3296390 (-0.36%) helped: 12 / HURT: 13 Broadwell total instructions in shared programs: 17818231 -> 17817440 (<.01%) instructions in affected programs: 9887 -> 9096 (-8.00%) helped: 9 / HURT: 0 total cycles in shared programs: 902970035 -> 902941221 (<.01%) cycles in affected programs: 2767243 -> 2738429 (-1.04%) helped: 14 / HURT: 5 total spills in shared programs: 17784 -> 17718 (-0.37%) spills in affected programs: 318 -> 252 (-20.75%) helped: 1 / HURT: 0 total fills in shared programs: 25458 -> 24949 (-2.00%) fills in affected programs: 1346 -> 837 (-37.82%) helped: 1 / HURT: 0 Haswell total instructions in shared programs: 16707799 -> 16707586 (<.01%) instructions in affected programs: 24049 -> 23836 (-0.89%) helped: 41 / HURT: 0 total cycles in shared programs: 882730648 -> 882723174 (<.01%) cycles in affected programs: 5096737 -> 5089263 (-0.15%) helped: 25 / HURT: 12 total spills in shared programs: 14937 -> 14909 (-0.19%) spills in affected programs: 436 -> 408 (-6.42%) helped: 4 / HURT: 0 total fills in shared programs: 17569 -> 17529 (-0.23%) fills in affected programs: 444 -> 404 (-9.01%) helped: 4 / HURT: 0 No shader-db changes on any older Intel platforms. All Intel platforms had similar results. 
(Ice Lake shown) Totals: Instrs: 153118594 -> 153117340 (-0.00%); split: -0.00%, +0.00% Cycles: 15011967556 -> 15011904351 (-0.00%); split: -0.00%, +0.00% Fill count: 203692 -> 203684 (-0.00%) Totals from 703 (0.11% of 662496) affected shaders: Instrs: 192826 -> 191572 (-0.65%); split: -0.65%, +0.00% Cycles: 29937640 -> 29874435 (-0.21%); split: -0.25%, +0.04% Fill count: 4146 -> 4138 (-0.19%) Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19968>	2023-06-14 18:49:53 +00:00
Ian Romanick	83bd87c558	nir: Add optimization pass to reassociate some bfi instructions The needs of this pass are ever so slightly more than what nir_opt_algebraic can do. :( Specifically, it needs to be able to look at the relationship of constant values used in an expression tree. v2: Add nir_mov_alu to handle swizzles on the original sources. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19968>	2023-06-14 18:49:53 +00:00
Mike Blumenkrantz	a085fead0c	zink: add some ci flakes Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23654>	2023-06-14 18:18:41 +00:00
Daniel Stone	2760aeb13e	CI: Re-enable freedreno CI Signed-off-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23108>	2023-06-14 17:39:29 +00:00
Daniel Stone	6af691dfff	ci: Extend a618_vk_full runtime Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23108>	2023-06-14 17:39:29 +00:00
Daniel Stone	c41d493f77	ci: Don't retry manual or scheduled jobs Only retry when there's some kind of non-job failure, such as runner-internal issues, or API/network issues, etc. If the job itself fails or times out, then given the length of these jobs, there's no point trying again and just tying up the job slots for even more hours. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23108>	2023-06-14 17:39:29 +00:00
Daniel Stone	47991a094e	ci: Elaborate causes for job retries Rather than always retrying, only retry jobs on a limited set of causes. This notably excludes retries when a job is stuck due to lack of runners to schedule it; if we can't get a slot on a runner in time, there's no reason to try again, since our window of opportunity has gone. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23108>	2023-06-14 17:39:29 +00:00
Emma Anholt	5ef4e1c4c0	ci: Drop some skips of GL CTS ArraysOfArrays tests. My hope is that with my CTS fix, we can complete these all in time now. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23610>	2023-06-14 16:45:23 +00:00
Emma Anholt	97744f11cf	ci: Drop skips for some previously-invalid CTS tests. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23610>	2023-06-14 16:45:23 +00:00
Emma Anholt	8c35537351	ci: Update to vulkan-cts-1.3.5.2 (and pull in some more fixes). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23610>	2023-06-14 16:45:23 +00:00
Emma Anholt	e3b0a79b3a	ci/zink: Update current xfails on tgl. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23610>	2023-06-14 16:45:23 +00:00
Emma Anholt	10b94772d2	intel: Reduce cost of resetting last_grf_write. In zink-on-anv fs-mod-dvec3-dvec3.shader_test, we were memsetting 2MB of last_grf_write 2400 times, multiple times through the scheduler. Just resetting for the processed instructions reduces runtime from 21s to 16s. No change on steam shader-db runtime across several runs. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23635>	2023-06-14 16:16:56 +00:00
Emma Anholt	7d4769e802	intel: Allocate the last_grf_write once per scheduler. No need to re-calloc it per block when we're going to use it again. Also, this fixes the vec4 backend to avoid allocating giant grf_count-sized arrays on the stack. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23635>	2023-06-14 16:16:56 +00:00
Emma Anholt	2ad865b219	intel: Count reads_remaining across all blocks. We were zeroing it out per block, but it doesn't actually help to count per block, since the question is "will scheduling this instruction free the reg?". Saves some memsetting, which was showing up high in the profile (but not from this source). No change on iris SKL shader-db. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23635>	2023-06-14 16:16:55 +00:00
Mike Blumenkrantz	12a47b84b7	egl/dri2: trigger drawable invalidation from surface queries for zink this mimics dri3 behavior and avoids scenarios where renderbuffers can get out of sync with their resources fixes #6744 Acked-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22858>	2023-06-14 15:38:21 +00:00
Mike Blumenkrantz	1563aea69f	lavapipe: add version uuid to shader binary validation this ensures compatible shader binaries across versions cc: mesa-stable Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23636>	2023-06-14 14:32:36 +00:00
Gert Wollny	b79f6ec397	r600: Disable SB if we use the variable length DOT sb doesn't know about this instruction, so don't try to run the optimizer. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23647>	2023-06-14 13:14:19 +00:00
Gert Wollny	269895c674	r600/sfn: Trigger use of ACK for some barriers Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23647>	2023-06-14 13:14:19 +00:00
Gert Wollny	d6280a8eef	r600/sfn: move kill handling to fully scheduling Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23647>	2023-06-14 13:14:19 +00:00
Gert Wollny	f7e6171f3a	r600: fix handling of use_sb flag The compiler will use the unsigned bit pattern of the check and combine this with the 1 bit, which will always result in use_sb being zero. Fix this by making use_sb a bool. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23647>	2023-06-14 13:14:19 +00:00
Mike Blumenkrantz	4e87d81d20	zink: add a dgc debug mode for testing this is useful for drivers trying to implement DGC since there is no cts. do not use. it will not make anything faster. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23550>	2023-06-14 12:37:24 +00:00
Lionel Landwerlin	6b9f838d62	intel/fs: handle load_global_constant_uniform_block_intel Again, load the data just once in GRF, share it across lanes. Shader-db on dg2: total instructions in shared programs: 23214555 -> 23215400 (<.01%) instructions in affected programs: 199977 -> 200822 (0.42%) helped: 3 HURT: 38 helped stats (abs) min: 5 max: 670 x̄: 283.67 x̃: 176 helped stats (rel) min: 1.34% max: 49.41% x̄: 22.15% x̃: 15.70% HURT stats (abs) min: 1 max: 185 x̄: 44.63 x̃: 32 HURT stats (rel) min: 0.13% max: 42.86% x̄: 10.25% x̃: 9.30% 95% mean confidence interval for instructions value: -18.65 59.87 95% mean confidence interval for instructions %-change: 3.29% 12.47% Inconclusive result (value mean confidence interval includes 0). total loops in shared programs: 5928 -> 5928 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 851137495 -> 851152449 (<.01%) cycles in affected programs: 16406137 -> 16421091 (0.09%) helped: 9 HURT: 32 helped stats (abs) min: 10 max: 13498 x̄: 6443.22 x̃: 5581 helped stats (rel) min: 0.11% max: 4.75% x̄: 1.45% x̃: 0.34% HURT stats (abs) min: 3 max: 15056 x̄: 2279.47 x̃: 735 HURT stats (rel) min: 0.10% max: 23.71% x̄: 4.58% x̃: 4.65% 95% mean confidence interval for cycles value: -1315.40 2044.87 95% mean confidence interval for cycles %-change: 1.71% 4.80% Inconclusive result (value mean confidence interval includes 0). 
total spills in shared programs: 11856 -> 11825 (-0.26%) spills in affected programs: 2368 -> 2337 (-1.31%) helped: 4 HURT: 0 total fills in shared programs: 16258 -> 16207 (-0.31%) fills in affected programs: 2930 -> 2879 (-1.74%) helped: 4 HURT: 0 total sends in shared programs: 1038194 -> 1038185 (<.01%) sends in affected programs: 40 -> 31 (-22.50%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 2.25 x̃: 2 helped stats (rel) min: 10.00% max: 33.33% x̄: 21.46% x̃: 21.25% 95% mean confidence interval for sends value: -4.64 0.14 95% mean confidence interval for sends %-change: -40.41% -2.51% Inconclusive result (value mean confidence interval includes 0). LOST: 0 GAINED: 0 Some VK/DX titles result (on DG2 only), it's mostly additional instruction counts except for the unity spaceship demo where a CS shader gets additional SIMDness. The reason for additional instructions is that since we're doing block loads, we need to find the live channels in control flow to select a single lane value that is valid. 
aztec_ruins_high: Totals from 3 (1.12% of 269) affected shaders: Instrs: 17732 -> 17896 (+0.92%) Cycles: 796518 -> 819302 (+2.86%) cyberpunk_2077: Totals from 17 (0.17% of 10301) affected shaders: Instrs: 10848 -> 11658 (+7.47%) Cycles: 248243 -> 259168 (+4.40%); split: -0.57%, +4.97% fallout_4_dxvk_g2: Totals from 2 (0.12% of 1638) affected shaders: Instrs: 3157 -> 3368 (+6.68%) Cycles: 487807 -> 490426 (+0.54%); split: -0.26%, +0.79% Max live registers: 139 -> 141 (+1.44%) red_dead_redemption2: Totals from 68 (1.14% of 5970) affected shaders: Instrs: 34871 -> 36486 (+4.63%) Cycles: 551430 -> 565211 (+2.50%) Send messages: 2074 -> 2072 (-0.10%) Max live registers: 5078 -> 5077 (-0.02%) total_war_warhammer2: Totals from 5 (1.05% of 478) affected shaders: Instrs: 6905 -> 6971 (+0.96%); split: -0.16%, +1.12% Cycles: 97035 -> 97989 (+0.98%); split: -0.07%, +1.05% unity spaceship demo (instruction count going up due to a CS shader bump from SIMD8->16): Totals from 53 (9.71% of 546) affected shaders: Instrs: 223748 -> 233223 (+4.23%); split: -0.01%, +4.25% Cycles: 23134697 -> 25207080 (+8.96%); split: -0.17%, +9.13% Subgroup size: 480 -> 488 (+1.67%) Spill count: 2156 -> 2242 (+3.99%); split: -0.19%, +4.17% Fill count: 4617 -> 4845 (+4.94%); split: -0.09%, +5.02% Max live registers: 5991 -> 6050 (+0.98%); split: -0.40%, +1.39% Max dispatch width: 480 -> 488 (+1.67%) witcher_3_dxvk_g2: Totals from 27 (2.51% of 1074) affected shaders: Instrs: 57067 -> 57677 (+1.07%); split: -0.03%, +1.10% Cycles: 1397871 -> 1436704 (+2.78%); split: -0.35%, +3.13% Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477>	2023-06-14 12:04:05 +00:00
Lionel Landwerlin	4ee1a8bb9c	nir: add a load_global_constant uniform intel variant Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477>	2023-06-14 12:04:05 +00:00
Lionel Landwerlin	5ae8a78d8c	intel/fs: make use of load_ubo_uniform_block_intel The principle is the same as the load_ssbo_uniform_block_intel. Whenever we see a uniform offset, load the data only once in GRFs to reduce register pressure. Iris shader-db run on DG2 : total instructions in shared programs: 23001325 -> 23094969 (0.41%) instructions in affected programs: 1775989 -> 1869633 (5.27%) helped: 764 HURT: 2097 helped stats (abs) min: 1 max: 102 x̄: 6.96 x̃: 2 helped stats (rel) min: 0.03% max: 16.91% x̄: 1.36% x̃: 0.63% HURT stats (abs) min: 1 max: 2461 x̄: 47.19 x̃: 7 HURT stats (rel) min: <.01% max: 199.34% x̄: 5.91% x̃: 2.60% 95% mean confidence interval for instructions value: 25.43 40.03 95% mean confidence interval for instructions %-change: 3.60% 4.33% Instructions are HURT. total loops in shared programs: 5847 -> 5847 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 839329852 -> 845491482 (0.73%) cycles in affected programs: 130229434 -> 136391064 (4.73%) helped: 1098 HURT: 2228 helped stats (abs) min: 1 max: 130102 x̄: 1340.64 x̃: 22 helped stats (rel) min: <.01% max: 64.25% x̄: 4.03% x̃: 0.71% HURT stats (abs) min: 1 max: 185309 x̄: 3426.24 x̃: 87 HURT stats (rel) min: <.01% max: 92.85% x̄: 8.12% x̃: 3.82% 95% mean confidence interval for cycles value: 1342.16 2362.97 95% mean confidence interval for cycles %-change: 3.70% 4.52% Cycles are HURT. 
total spills in shared programs: 10768 -> 11856 (10.10%) spills in affected programs: 9717 -> 10805 (11.20%) helped: 25 HURT: 28 total fills in shared programs: 13720 -> 16258 (18.50%) fills in affected programs: 12016 -> 14554 (21.12%) helped: 25 HURT: 28 total sends in shared programs: 1034790 -> 1031266 (-0.34%) sends in affected programs: 33416 -> 29892 (-10.55%) helped: 1005 HURT: 0 helped stats (abs) min: 1 max: 22 x̄: 3.51 x̃: 3 helped stats (rel) min: 1.69% max: 60.00% x̄: 15.20% x̃: 14.08% 95% mean confidence interval for sends value: -3.72 -3.29 95% mean confidence interval for sends %-change: -15.82% -14.57% Sends are helped. LOST: 26 GAINED: 183 shader-db on a number of VK/DX titles on DG2 : PERCENTAGE DELTAS Shaders Instrs Cycles age_of_wonders_III 1928 +0.02% -0.19% PERCENTAGE DELTAS Shaders Instrs Cycles Subgroup size Send messages Spill count Fill count Max live registers Max dispatch width assassins_creed_odyssey 2119 +1.12% -0.42% -0.03% -0.29% -9.10% -4.26% -0.64% +0.65% PERCENTAGE DELTAS Shaders Instrs Cycles Spill count Fill count Max live registers aztec_ruins_high 269 -0.05% -0.45% -0.29% -7.27% -0.33% PERCENTAGE DELTAS Shaders Instrs Cycles Max live registers Max dispatch width dark_souls_3_dxvk_g2 1420 +0.09% +0.24% +0.21% +0.12% (stats look bad, but it's just one shader affected) PERCENTAGE DELTAS Shaders Instrs Cycles Spill count Fill count Scratch Memory Size Max live registers fallout_4_dxvk_g2 1638 +0.67% +8.32% +16.02% +7.17% +100.00% +0.48% PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Spill count Fill count Max live registers Max dispatch width red_dead_redemption2 5969 +0.16% -0.04% -0.04% +0.01% +0.05% -0.20% +0.04% PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Max live registers Max dispatch width rise_of_the_tomb_raider_g2 12129 +2.19% +1.36% -1.23% -0.36% +2.04% PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Max live registers shooter-game 693 +0.07% -0.89% -0.09% -0.09% PERCENTAGE DELTAS Shaders Instrs 
Cycles Send messages Max live registers Max dispatch width talos_g2 1140 +0.37% +3.80% -0.86% -0.67% +0.19% PERCENTAGE DELTAS Shaders Instrs Cycles Max live registers Max dispatch width total_war_warhammer2 477 +0.25% +0.66% -0.17% +0.10% PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Max live registers Max dispatch width witcher_3_dxvk_g2 1074 +0.75% -10.45% -0.15% -0.16% -0.16% PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Max live registers wolfenstein_youngblood 1111 +0.52% +0.66% -0.59% -0.03% Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477>	2023-06-14 12:04:05 +00:00
Lionel Landwerlin	4a23a5a904	nir: add a new ubo uniform loading intrinsic for intel Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477>	2023-06-14 12:04:05 +00:00
Lionel Landwerlin	7eb1e2a690	intel/fs: avoid reusing the VGRF for uniform load_ubo Only found 3 shaders affected in Red Dead Redemption : Totals from 3 (0.05% of 5969) affected shaders: Instrs: 2246 -> 2230 (-0.71%) Cycles: 156506 -> 148402 (-5.18%); split: -5.23%, +0.05% This will have a larger effect when we add the load_ubo_uniform_block_intel intrinsic where we will have larger blocks (vec8/vec16 vs vec4 only now). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477>	2023-06-14 12:04:05 +00:00
Lionel Landwerlin	ff3494fce3	intel/fs: print indentation for control flow INTEL_DEBUG=optimizer output changes from : { 10} 40: cmp.nz.f0.0(8) null:F, vgrf3470:F, 0f { 10} 41: (+f0.0) if(8) (null):UD, { 11} 42: txf_logical(8) vgrf3473:UD, vgrf250:D(null):UD, 0d(null):UD(null):UD(null):UD(null):UD, 31u, 0u(null):UD(null):UD(null):UD, 3d, 0d { 12} 43: and(8) vgrf262:UD, vgrf3473:UD, 2u { 11} 44: cmp.nz.f0.0(8) null:D, vgrf262:D, 0d { 10} 45: (+f0.0) if(8) (null):UD, { 11} 46: mov(8) vgrf270:D, -1082130432d { 12} 47: mov(8) vgrf271:D, 1082130432d { 14} 48: mov(8) vgrf274+0.0:D, 0d { 14} 49: mov(8) vgrf274+1.0:D, 0d to : { 10} 40: cmp.nz.f0.0(8) null:F, vgrf3470:F, 0f { 10} 41: (+f0.0) if(8) (null):UD, { 11} 42: txf_logical(8) vgrf3473:UD, vgrf250:D(null):UD, 0d(null):UD(null):UD(null):UD(null):UD, 31u, 0u(null):UD(null):UD(null):UD, 3d, 0d { 12} 43: and(8) vgrf262:UD, vgrf3473:UD, 2u { 11} 44: cmp.nz.f0.0(8) null:D, vgrf262:D, 0d { 10} 45: (+f0.0) if(8) (null):UD, { 11} 46: mov(8) vgrf270:D, -1082130432d { 12} 47: mov(8) vgrf271:D, 1082130432d { 14} 48: mov(8) vgrf274+0.0:D, 0d { 14} 49: mov(8) vgrf274+1.0:D, 0d Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477>	2023-06-14 12:04:05 +00:00
Karol Herbst	5b3ff7e3f3	rusticl/queue: overhaul of the queue+event handling This new approach handles things as follows: 1. Fences won't be attached to events anymore, applications only wait on the cv attached to the event. 2. Only the queue is allowed to update event status for non user events. This will eliminate all remaining status updating races between the queue and applications waiting on events. 3. The queue minimizes flushing by bundling events. 4. Increase cv wait timeout as there is really no point in waking up too often. Reduces amount of emitted fences on radeonsi in luxmark 3.1 luxball by 90% Signed-off-by: Karol Herbst <git@karolherbst.de> Reviewed-by: Nora Allen <blackcatgames@protonmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23612>	2023-06-14 11:14:46 +00:00
Iago Toral Quiroga	6114e66124	broadcom/compiler: only use last thread switch flag to detect final section Since commit 'c98ddc778a3 broadcom/compiler: force a last thrsw for spilling' we always ensure we signal the last thread section explicitly with a last thread switch. Relying on VPM stores to detect the last thread section is particularly bad, because we can have VPM stores occurring quite early in a shader program, which would disable TMU spilling almost entirely. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22461>	2023-06-14 09:27:50 +00:00
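
The r600 use_sb fix listed above (f7e6171f3a) describes a masked flag check being stored in a 1-bit field and always reading back as zero. The following is a minimal C sketch of that class of bug, not the r600 code itself: the struct names, the helper functions, and the flag value DEMO_FLAG_USE_SB are all hypothetical and chosen only for illustration. It assumes the flag bit is not bit 0, which is the case where truncation to one bit loses the result.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag value (bit 2); not taken from the r600 sources. */
#define DEMO_FLAG_USE_SB 0x4u

/* Hypothetical "before" layout: a 1-bit unsigned bit-field. */
struct demo_bitfield {
   unsigned use_sb : 1;   /* keeps only bit 0 of the assigned value */
};

/* Hypothetical "after" layout: a plain bool. */
struct demo_bool {
   bool use_sb;           /* any nonzero value converts to true */
};

/* Stores the raw mask result in the 1-bit field and returns what was kept. */
unsigned store_in_bitfield(unsigned flags)
{
   struct demo_bitfield s = { 0 };
   s.use_sb = flags & DEMO_FLAG_USE_SB;   /* 0x4 reduced modulo 2 is 0 */
   return s.use_sb;
}

/* Stores the same mask result in a bool and returns what was kept. */
bool store_in_bool(unsigned flags)
{
   struct demo_bool s = { false };
   s.use_sb = flags & DEMO_FLAG_USE_SB;   /* nonzero becomes true */
   return s.use_sb;
}

/* Small self-check that exercises both variants with the flag set. */
void demo_self_check(void)
{
   assert(store_in_bitfield(DEMO_FLAG_USE_SB) == 0u);
   assert(store_in_bool(DEMO_FLAG_USE_SB) == true);
}
```

Calling demo_self_check() from a test program exercises both variants. Normalizing the check with `!!(flags & DEMO_FLAG_USE_SB)` would be another way to keep a 1-bit field correct; the commit message says the actual fix was to make use_sb a bool.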


173097 commits