Caio Oliveira
40ba00238b
compiler/types: Tidy up the asserts in get_*_instance functions
...
Use the local variable in the assertions, move them out the critical region.
No behavior change.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23279 >
2023-06-15 03:43:46 +00:00
Caio Oliveira
efbbdeffc0
compiler/types: Be consistent when naming array element/size
...
The element type passed is different than the array type and it is not
a "base type" in the glsl_type sense, so pick a name that reflects that.
Also stick to a single name for the array_size.
Just renames, no behavior change.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23279 >
2023-06-15 03:43:46 +00:00
Jesse Natalie
83f741124b
nir_lower_returns: Mark assert-only var as ASSERTED
...
Fixes: 5d238c0c ("nir_lower_returns: Optimize phis before beginning the pass")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23634 >
2023-06-15 03:09:29 +00:00
Dave Airlie
13df91d7d7
radv/video: restrict the number of IBs on video related queues.
...
The hardware gets given a session context from userspace in each
submission, but if the session context changes the hardware wants
a FENCE to be emitted to know it can give up the current session.
IF a test submits interleaved session ctx access and uses a single
vulkan submit the hardware crashes, unless each IB is submitted
in a separate submission so the fence can be sent.
In theory it could be possible to construct a single command buffer
to trigger this so I do think the hardware should be smarter here.
Should this be fixed in the kernel to always emit a fence between
IBs?
Fixes: dEQP-VK.video.decode.h264_interleaved
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23641 >
2023-06-15 02:49:00 +00:00
LingMan
0535948535
rusticl: fix UB in CLProp machinery
...
Viewing structs as a collection of u8 is not generally sound. Any padding bytes might be
uninitialized and creating an integer from uninitialized memory constitutes producing an invalid
value, which is instant UB.
Since we only copy these bytes around, the fix is to simply work with MaybeUninit<u8>, which can handle uninitialized memory just fine, instead.
See: https://doc.rust-lang.org/reference/behavior-considered-undefined.html
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23652 >
2023-06-15 02:31:19 +00:00
LingMan
fdcb86168d
rusticl: drop cl_prop_for_type macro
...
There's no reason to differentiate between primitive types and structs here. `cl_prop_for_struct`
can handle primitive types just fine.
Drop `cl_prop_for_type` and rename the existing `cl_prop_for_struct` to `cl_prop_for_type`.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23652 >
2023-06-15 02:31:19 +00:00
LingMan
cf43a74c79
rusticl: drop CLProp implementation for String
...
Route the data to the implementation for &str instead. It works just as fine.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23652 >
2023-06-15 02:31:19 +00:00
LingMan
f1461c5a77
rusticl: core: stop using cl_prop from the api module
...
It's a layering violation and really the wrong tool for the job. Add a new fn to view a given slice
as a &[u8] instead of going though the clprop machinery which creates a new Vec.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23652 >
2023-06-15 02:31:18 +00:00
Charmaine Lee
2755519142
svga: fix compute shader type after ntt
...
Reset compute shader type after ntt.
Fixes: 0ac9541804 ("gallium: Drop PIPE_SHADER_CAP_PREFERRED_IR")
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23659 >
2023-06-15 02:11:38 +00:00
Karol Herbst
095fee55f8
rusticl: enforce using unsafe blocks in unsafe functions
...
Signed-off-by: Karol Herbst <git@karolherbst.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23660 >
2023-06-15 01:53:11 +00:00
Mike Blumenkrantz
4edbe8f5a0
zink: add mem debugging
...
modeled off turnip's debug infra, this adds debug printing for oom
scenarios
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23653 >
2023-06-15 01:31:24 +00:00
Mike Blumenkrantz
65fad783c7
zink: break out vk flag unrolling into util function
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23653 >
2023-06-15 01:31:24 +00:00
Ian Romanick
de60b463d7
nir/algebraic: Simplify various trivial bfi
...
These are mostly just obvious patterns that somebody will eventually
want to add.
DG2, Tiger Lake, Ice Lake, Skylake, Broadwell, and Haswell had similar
results (Ice Lake shown)
total instructions in shared programs: 20570033 -> 20570026 (<.01%)
instructions in affected programs: 7363 -> 7356 (-0.10%)
helped: 6 / HURT: 0
total cycles in shared programs: 902118781 -> 902118854 (<.01%)
cycles in affected programs: 419132 -> 419205 (0.02%)
helped: 4 / HURT: 2
DG2, Tiger Lake, Ice Lake, and Skylake had similar results (Ice Lake shown)
Totals:
Instrs: 152819500 -> 152819380 (-0.00%)
Cycles: 15014627187 -> 15014624437 (-0.00%)
Totals from 115 (0.02% of 662497) affected shaders:
Instrs: 28963 -> 28843 (-0.41%)
Cycles: 404582 -> 401832 (-0.68%)
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19968 >
2023-06-14 18:49:53 +00:00
Ian Romanick
541e7eb389
nir/algebraic: Optimize some u2f of bfi
...
v2: Fix a copy-and-paste bug s/('find_lsb', a)/a/ in the patterns. See
piglit!819.
DG2, Tiger Lake, Ice Lake, Skylake, and Broadwell had similar results (Ice Lake shown)
total instructions in shared programs: 20570063 -> 20570033 (<.01%)
instructions in affected programs: 452 -> 422 (-6.64%)
helped: 30 / HURT: 0
total cycles in shared programs: 902118723 -> 902118781 (<.01%)
cycles in affected programs: 1762 -> 1820 (3.29%)
helped: 0 / HURT: 29
DG2, Tiger Lake, Ice Lake, and Skylake had similar results (Ice Lake shown)
Totals:
Instrs: 152819969 -> 152819500 (-0.00%)
Cycles: 15014628652 -> 15014627187 (-0.00%); split: -0.00%, +0.00%
Totals from 469 (0.07% of 662497) affected shaders:
Instrs: 7644 -> 7175 (-6.14%)
Cycles: 31787 -> 30322 (-4.61%); split: -4.90%, +0.29%
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19968 >
2023-06-14 18:49:53 +00:00
Ian Romanick
96cde9cc01
intel/fs: Emit better code for bfi(..., 0)
...
DG2, Tiger Lake, Ice Lake, and Skylake had similar results (Ice Lake shown)
total instructions in shared programs: 20570141 -> 20570063 (<.01%)
instructions in affected programs: 30679 -> 30601 (-0.25%)
helped: 77 / HURT: 0
total cycles in shared programs: 902113977 -> 902118723 (<.01%)
cycles in affected programs: 3255958 -> 3260704 (0.15%)
helped: 60 / HURT: 19
Broadwell
total instructions in shared programs: 18524633 -> 18524547 (<.01%)
instructions in affected programs: 34095 -> 34009 (-0.25%)
helped: 75 / HURT: 2
total cycles in shared programs: 949532394 -> 949543761 (<.01%)
cycles in affected programs: 3419107 -> 3430474 (0.33%)
helped: 57 / HURT: 24
total spills in shared programs: 22484 -> 22484 (0.00%)
spills in affected programs: 516 -> 516 (0.00%)
helped: 2 / HURT: 2
total fills in shared programs: 29346 -> 29338 (-0.03%)
fills in affected programs: 572 -> 564 (-1.40%)
helped: 4 / HURT: 0
Haswell
total instructions in shared programs: 17331356 -> 17331523 (<.01%)
instructions in affected programs: 27920 -> 28087 (0.60%)
helped: 41 / HURT: 4
total cycles in shared programs: 936603192 -> 936574664 (<.01%)
cycles in affected programs: 3417695 -> 3389167 (-0.83%)
helped: 28 / HURT: 21
total spills in shared programs: 19718 -> 19756 (0.19%)
spills in affected programs: 436 -> 474 (8.72%)
helped: 0 / HURT: 4
total fills in shared programs: 22547 -> 22607 (0.27%)
fills in affected programs: 444 -> 504 (13.51%)
helped: 0 / HURT: 4
Ivy Bridge
total cycles in shared programs: 463451277 -> 463451273 (<.01%)
cycles in affected programs: 95870 -> 95866 (<.01%)
helped: 3 / HURT: 2
DG2, Tiger Lake, Ice Lake, and Skylake had similar results (Ice Lake shown)
Totals:
Instrs: 152825278 -> 152819969 (-0.00%); split: -0.00%, +0.00%
Cycles: 15014075626 -> 15014628652 (+0.00%); split: -0.01%, +0.01%
Subgroup size: 8528536 -> 8528560 (+0.00%)
Send messages: 7711431 -> 7711464 (+0.00%)
Spill count: 99907 -> 99509 (-0.40%); split: -0.40%, +0.00%
Fill count: 202459 -> 201598 (-0.43%); split: -0.43%, +0.00%
Scratch Memory Size: 4376576 -> 4371456 (-0.12%)
Totals from 2915 (0.44% of 662497) affected shaders:
Instrs: 2288842 -> 2283533 (-0.23%); split: -0.24%, +0.01%
Cycles: 471633295 -> 472186321 (+0.12%); split: -0.27%, +0.39%
Subgroup size: 27488 -> 27512 (+0.09%)
Send messages: 151344 -> 151377 (+0.02%)
Spill count: 48091 -> 47693 (-0.83%); split: -0.83%, +0.00%
Fill count: 59053 -> 58192 (-1.46%); split: -1.46%, +0.00%
Scratch Memory Size: 1827840 -> 1822720 (-0.28%)
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19968 >
2023-06-14 18:49:53 +00:00
Ian Romanick
6603948a7a
nir/algebraic: Lower some bfi with two constant sources
...
All Haswell and newer Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 19907054 -> 19906882 (<.01%)
instructions in affected programs: 8103 -> 7931 (-2.12%)
helped: 52 / HURT: 0
total cycles in shared programs: 855779334 -> 855781791 (<.01%)
cycles in affected programs: 724201 -> 726658 (0.34%)
helped: 38 / HURT: 7
total sends in shared programs: 1039308 -> 1039302 (<.01%)
sends in affected programs: 162 -> 156 (-3.70%)
helped: 2 / HURT: 0
No shader-db changes on any older Intel platforms.
All Intel platforms had similar restuls. (Ice Lake shown)
Totals:
Instrs: 153117340 -> 152825222 (-0.19%); split: -0.19%, +0.00%
Cycles: 15011904351 -> 15014072944 (+0.01%); split: -0.04%, +0.05%
Send messages: 7711509 -> 7711421 (-0.00%)
Spill count: 100745 -> 99907 (-0.83%); split: -0.85%, +0.02%
Fill count: 203684 -> 202459 (-0.60%); split: -0.62%, +0.02%
Scratch Memory Size: 4403200 -> 4376576 (-0.60%)
Totals from 18603 (2.81% of 662496) affected shaders:
Instrs: 5258303 -> 4966185 (-5.56%); split: -5.56%, +0.00%
Cycles: 447391388 -> 449559981 (+0.48%); split: -1.29%, +1.77%
Send messages: 559231 -> 559143 (-0.02%)
Spill count: 5009 -> 4171 (-16.73%); split: -17.17%, +0.44%
Fill count: 8769 -> 7544 (-13.97%); split: -14.33%, +0.36%
Scratch Memory Size: 194560 -> 167936 (-13.68%)
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19968 >
2023-06-14 18:49:53 +00:00
Ian Romanick
e419eefd34
intel/fs: Use nir_opt_reassociate_bfi
...
All Skylake and newer Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 19907072 -> 19907054 (<.01%)
instructions in affected programs: 8859 -> 8841 (-0.20%)
helped: 9 / HURT: 0
total cycles in shared programs: 855791238 -> 855779334 (<.01%)
cycles in affected programs: 3308294 -> 3296390 (-0.36%)
helped: 12 / HURT: 13
Broadwell
total instructions in shared programs: 17818231 -> 17817440 (<.01%)
instructions in affected programs: 9887 -> 9096 (-8.00%)
helped: 9 / HURT: 0
total cycles in shared programs: 902970035 -> 902941221 (<.01%)
cycles in affected programs: 2767243 -> 2738429 (-1.04%)
helped: 14 / HURT: 5
total spills in shared programs: 17784 -> 17718 (-0.37%)
spills in affected programs: 318 -> 252 (-20.75%)
helped: 1 / HURT: 0
total fills in shared programs: 25458 -> 24949 (-2.00%)
fills in affected programs: 1346 -> 837 (-37.82%)
helped: 1 / HURT: 0
Haswell
total instructions in shared programs: 16707799 -> 16707586 (<.01%)
instructions in affected programs: 24049 -> 23836 (-0.89%)
helped: 41 / HURT: 0
total cycles in shared programs: 882730648 -> 882723174 (<.01%)
cycles in affected programs: 5096737 -> 5089263 (-0.15%)
helped: 25 / HURT: 12
total spills in shared programs: 14937 -> 14909 (-0.19%)
spills in affected programs: 436 -> 408 (-6.42%)
helped: 4 / HURT: 0
total fills in shared programs: 17569 -> 17529 (-0.23%)
fills in affected programs: 444 -> 404 (-9.01%)
helped: 4 / HURT: 0
No shader-db changes on any older Intel platforms.
All Intel platforms had similar results. (Ice Lake shown)
Totals:
Instrs: 153118594 -> 153117340 (-0.00%); split: -0.00%, +0.00%
Cycles: 15011967556 -> 15011904351 (-0.00%); split: -0.00%, +0.00%
Fill count: 203692 -> 203684 (-0.00%)
Totals from 703 (0.11% of 662496) affected shaders:
Instrs: 192826 -> 191572 (-0.65%); split: -0.65%, +0.00%
Cycles: 29937640 -> 29874435 (-0.21%); split: -0.25%, +0.04%
Fill count: 4146 -> 4138 (-0.19%)
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19968 >
2023-06-14 18:49:53 +00:00
Ian Romanick
83bd87c558
nir: Add optimization pass to reassociate some bfi instructions
...
The needs of this pass are ever so slightly more than what
nir_opt_algebraic can do. :( Specifically, it needs to be able to look
at the relationship of constant values used in an expression tree.
v2: Add nir_mov_alu to handle swizzles on the original sources.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19968 >
2023-06-14 18:49:53 +00:00
Mike Blumenkrantz
a085fead0c
zink: add some ci flakes
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23654 >
2023-06-14 18:18:41 +00:00
Daniel Stone
2760aeb13e
CI: Re-enable freedreno CI
...
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23108 >
2023-06-14 17:39:29 +00:00
Daniel Stone
6af691dfff
ci: Extend a618_vk_full runtime
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23108 >
2023-06-14 17:39:29 +00:00
Daniel Stone
c41d493f77
ci: Don't retry manual or scheduled jobs
...
Only retry when there's some kind of non-job failure, such as
runner-internal issues, or API/network issues, etc. If the job itself
fails or times out, then given the length of these jobs, there's no
point trying again and just tying up the job slots for even more hours.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23108 >
2023-06-14 17:39:29 +00:00
Daniel Stone
47991a094e
ci: Elaborate causes for job retries
...
Rather than always retrying, only retry jobs on a limited set of causes.
This notably excludes retries when a job is stuck due to lack of runners
to schedule it; if we can't get a slot on a runner in time, there's no
reason to try again, since our window of opportunity has gone.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23108 >
2023-06-14 17:39:29 +00:00
Emma Anholt
5ef4e1c4c0
ci: Drop some skips of GL CTS ArraysOfArrays tests.
...
My hope is that with my CTS fix, we can complete these all in time now.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23610 >
2023-06-14 16:45:23 +00:00
Emma Anholt
97744f11cf
ci: Drop skips for some previously-invalid CTS tests.
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23610 >
2023-06-14 16:45:23 +00:00
Emma Anholt
8c35537351
ci: Update to vulkan-cts-1.3.5.2 (and pull in some more fixes).
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23610 >
2023-06-14 16:45:23 +00:00
Emma Anholt
e3b0a79b3a
ci/zink: Update current xfails on tgl.
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23610 >
2023-06-14 16:45:23 +00:00
Emma Anholt
10b94772d2
intel: Reduce cost of resetting last_grf_write.
...
In zink-on-anv fs-mod-dvec3-dvec3.shader_test, we were memsetting 2MB of
last_grf_write 2400 times, multiple times through the scheduler. Just
resetting for the processed instructions reduces runtime from 21s to 16s.
No change on steam shader-db runtime across several runs.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23635 >
2023-06-14 16:16:56 +00:00
Emma Anholt
7d4769e802
intel: Allocate the last_grf_write once per scheduler.
...
No need to re-calloc it per block when we're going to use it again. Also,
this fixes the vec4 backend to avoid allocating giant grf_count-sized
arrays on the stack.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23635 >
2023-06-14 16:16:56 +00:00
Emma Anholt
2ad865b219
intel: Count reads_remaining across all blocks.
...
We were zeroing it out per block, but it doesn't actually help to count
per block, since the question is "will scheduling this instruction free
the reg?". Saves some memsetting, which was showing up high in the
profile (but not from this source).
No change on iris SKL shader-db.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23635 >
2023-06-14 16:16:55 +00:00
Mike Blumenkrantz
12a47b84b7
egl/dri2: trigger drawable invalidation from surface queries for zink
...
this mimics dri3 behavior and avoids scenarios where renderbuffers can
get out of sync with their resources
fixes #6744
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22858 >
2023-06-14 15:38:21 +00:00
Mike Blumenkrantz
1563aea69f
lavapipe: add version uuid to shader binary validation
...
this ensures compatible shader binaries across versions
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23636 >
2023-06-14 14:32:36 +00:00
Gert Wollny
b79f6ec397
r600: Disable SB if we use the ariable length DOT
...
sb doesn't know about this instruction, so don't try to run the
optimizer.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23647 >
2023-06-14 13:14:19 +00:00
Gert Wollny
269895c674
600/sfn: Trigger use of ACK for some barriers
...
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23647 >
2023-06-14 13:14:19 +00:00
Gert Wollny
d6280a8eef
r600/sfn: move kill handling to fully scheduling
...
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23647 >
2023-06-14 13:14:19 +00:00
Gert Wollny
f7e6171f3a
r600: fix handling of use_sb flag
...
The compiler will use the unsigned bit pattern of the check and combine this
with the 1 bit, which will always result in use_sb to be zero.
Fix this by making use_sb a bool
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23647 >
2023-06-14 13:14:19 +00:00
Mike Blumenkrantz
4e87d81d20
zink: add a dgc debug mode for testing
...
this is useful for drivers trying to implement DGC since there is no cts
do not use.
it will not make anything faster.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23550 >
2023-06-14 12:37:24 +00:00
Lionel Landwerlin
6b9f838d62
intel/fs: handle load_global_constant_uniform_block_intel
...
Again, load the data just once in GRF, share it across lanes.
Shader-db on dg2:
total instructions in shared programs: 23214555 -> 23215400 (<.01%)
instructions in affected programs: 199977 -> 200822 (0.42%)
helped: 3
HURT: 38
helped stats (abs) min: 5 max: 670 x̄: 283.67 x̃: 176
helped stats (rel) min: 1.34% max: 49.41% x̄: 22.15% x̃: 15.70%
HURT stats (abs) min: 1 max: 185 x̄: 44.63 x̃: 32
HURT stats (rel) min: 0.13% max: 42.86% x̄: 10.25% x̃: 9.30%
95% mean confidence interval for instructions value: -18.65 59.87
95% mean confidence interval for instructions %-change: 3.29% 12.47%
Inconclusive result (value mean confidence interval includes 0).
total loops in shared programs: 5928 -> 5928 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0
total cycles in shared programs: 851137495 -> 851152449 (<.01%)
cycles in affected programs: 16406137 -> 16421091 (0.09%)
helped: 9
HURT: 32
helped stats (abs) min: 10 max: 13498 x̄: 6443.22 x̃: 5581
helped stats (rel) min: 0.11% max: 4.75% x̄: 1.45% x̃: 0.34%
HURT stats (abs) min: 3 max: 15056 x̄: 2279.47 x̃: 735
HURT stats (rel) min: 0.10% max: 23.71% x̄: 4.58% x̃: 4.65%
95% mean confidence interval for cycles value: -1315.40 2044.87
95% mean confidence interval for cycles %-change: 1.71% 4.80%
Inconclusive result (value mean confidence interval includes 0).
total spills in shared programs: 11856 -> 11825 (-0.26%)
spills in affected programs: 2368 -> 2337 (-1.31%)
helped: 4
HURT: 0
total fills in shared programs: 16258 -> 16207 (-0.31%)
fills in affected programs: 2930 -> 2879 (-1.74%)
helped: 4
HURT: 0
total sends in shared programs: 1038194 -> 1038185 (<.01%)
sends in affected programs: 40 -> 31 (-22.50%)
helped: 4
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 2.25 x̃: 2
helped stats (rel) min: 10.00% max: 33.33% x̄: 21.46% x̃: 21.25%
95% mean confidence interval for sends value: -4.64 0.14
95% mean confidence interval for sends %-change: -40.41% -2.51%
Inconclusive result (value mean confidence interval includes 0).
LOST: 0
GAINED: 0
Some VK/DX titles result (on DG2 only), it's mostly additional
instruction counts except for the unity spaceship demo where a CS
shader gets additional SIMDness. The reason for additional
instructions is that since we're doing block loads, we need to find
the live channels in control flow to select a single lane value that
is valid.
aztec_ruins_high:
Totals from 3 (1.12% of 269) affected shaders:
Instrs: 17732 -> 17896 (+0.92%)
Cycles: 796518 -> 819302 (+2.86%)
cyberpunk_2077:
Totals from 17 (0.17% of 10301) affected shaders:
Instrs: 10848 -> 11658 (+7.47%)
Cycles: 248243 -> 259168 (+4.40%); split: -0.57%, +4.97%
fallout_4_dxvk_g2:
Totals from 2 (0.12% of 1638) affected shaders:
Instrs: 3157 -> 3368 (+6.68%)
Cycles: 487807 -> 490426 (+0.54%); split: -0.26%, +0.79%
Max live registers: 139 -> 141 (+1.44%)
red_dead_redemption2:
Totals from 68 (1.14% of 5970) affected shaders:
Instrs: 34871 -> 36486 (+4.63%)
Cycles: 551430 -> 565211 (+2.50%)
Send messages: 2074 -> 2072 (-0.10%)
Max live registers: 5078 -> 5077 (-0.02%)
total_war_warhammer2:
Totals from 5 (1.05% of 478) affected shaders:
Instrs: 6905 -> 6971 (+0.96%); split: -0.16%, +1.12%
Cycles: 97035 -> 97989 (+0.98%); split: -0.07%, +1.05%
unity spaceship demo (instruction count going up due to a CS shader
bump from SIMD8->16):
Totals from 53 (9.71% of 546) affected shaders:
Instrs: 223748 -> 233223 (+4.23%); split: -0.01%, +4.25%
Cycles: 23134697 -> 25207080 (+8.96%); split: -0.17%, +9.13%
Subgroup size: 480 -> 488 (+1.67%)
Spill count: 2156 -> 2242 (+3.99%); split: -0.19%, +4.17%
Fill count: 4617 -> 4845 (+4.94%); split: -0.09%, +5.02%
Max live registers: 5991 -> 6050 (+0.98%); split: -0.40%, +1.39%
Max dispatch width: 480 -> 488 (+1.67%)
witcher_3_dxvk_g2:
Totals from 27 (2.51% of 1074) affected shaders:
Instrs: 57067 -> 57677 (+1.07%); split: -0.03%, +1.10%
Cycles: 1397871 -> 1436704 (+2.78%); split: -0.35%, +3.13%
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477 >
2023-06-14 12:04:05 +00:00
Lionel Landwerlin
4ee1a8bb9c
nir: add a load_global_constant uniform intel variant
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477 >
2023-06-14 12:04:05 +00:00
Lionel Landwerlin
5ae8a78d8c
intel/fs: make use of load_ubo_uniform_block_intel
...
The principle is the same as the load_ssbo_uniform_block_intel.
Whenever we see a uniform offset, load the data only once in GRFs to
reduce register pressure.
Iris shader-db run on DG2 :
total instructions in shared programs: 23001325 -> 23094969 (0.41%)
instructions in affected programs: 1775989 -> 1869633 (5.27%)
helped: 764
HURT: 2097
helped stats (abs) min: 1 max: 102 x̄: 6.96 x̃: 2
helped stats (rel) min: 0.03% max: 16.91% x̄: 1.36% x̃: 0.63%
HURT stats (abs) min: 1 max: 2461 x̄: 47.19 x̃: 7
HURT stats (rel) min: <.01% max: 199.34% x̄: 5.91% x̃: 2.60%
95% mean confidence interval for instructions value: 25.43 40.03
95% mean confidence interval for instructions %-change: 3.60% 4.33%
Instructions are HURT.
total loops in shared programs: 5847 -> 5847 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0
total cycles in shared programs: 839329852 -> 845491482 (0.73%)
cycles in affected programs: 130229434 -> 136391064 (4.73%)
helped: 1098
HURT: 2228
helped stats (abs) min: 1 max: 130102 x̄: 1340.64 x̃: 22
helped stats (rel) min: <.01% max: 64.25% x̄: 4.03% x̃: 0.71%
HURT stats (abs) min: 1 max: 185309 x̄: 3426.24 x̃: 87
HURT stats (rel) min: <.01% max: 92.85% x̄: 8.12% x̃: 3.82%
95% mean confidence interval for cycles value: 1342.16 2362.97
95% mean confidence interval for cycles %-change: 3.70% 4.52%
Cycles are HURT.
total spills in shared programs: 10768 -> 11856 (10.10%)
spills in affected programs: 9717 -> 10805 (11.20%)
helped: 25
HURT: 28
total fills in shared programs: 13720 -> 16258 (18.50%)
fills in affected programs: 12016 -> 14554 (21.12%)
helped: 25
HURT: 28
total sends in shared programs: 1034790 -> 1031266 (-0.34%)
sends in affected programs: 33416 -> 29892 (-10.55%)
helped: 1005
HURT: 0
helped stats (abs) min: 1 max: 22 x̄: 3.51 x̃: 3
helped stats (rel) min: 1.69% max: 60.00% x̄: 15.20% x̃: 14.08%
95% mean confidence interval for sends value: -3.72 -3.29
95% mean confidence interval for sends %-change: -15.82% -14.57%
Sends are helped.
LOST: 26
GAINED: 183
shader-db on a number of VK/DX titles on DG2 :
PERCENTAGE DELTAS Shaders Instrs Cycles
age_of_wonders_III 1928 +0.02% -0.19%
PERCENTAGE DELTAS Shaders Instrs Cycles Subgroup size Send messages Spill count Fill count Max live registers Max dispatch width
assassins_creed_odyssey 2119 +1.12% -0.42% -0.03% -0.29% -9.10% -4.26% -0.64% +0.65%
PERCENTAGE DELTAS Shaders Instrs Cycles Spill count Fill count Max live registers
aztec_ruins_high 269 -0.05% -0.45% -0.29% -7.27% -0.33%
PERCENTAGE DELTAS Shaders Instrs Cycles Max live registers Max dispatch width
dark_souls_3_dxvk_g2 1420 +0.09% +0.24% +0.21% +0.12%
(stats look bad, but it's just one shader affected)
PERCENTAGE DELTAS Shaders Instrs Cycles Spill count Fill count Scratch Memory Size Max live registers
fallout_4_dxvk_g2 1638 +0.67% +8.32% +16.02% +7.17% +100.00% +0.48%
PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Spill count Fill count Max live registers Max dispatch width
red_dead_redemption2 5969 +0.16% -0.04% -0.04% +0.01% +0.05% -0.20% +0.04%
PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Max live registers Max dispatch width
rise_of_the_tomb_raider_g2 12129 +2.19% +1.36% -1.23% -0.36% +2.04%
PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Max live registers
shooter-game 693 +0.07% -0.89% -0.09% -0.09%
PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Max live registers Max dispatch width
talos_g2 1140 +0.37% +3.80% -0.86% -0.67% +0.19%
PERCENTAGE DELTAS Shaders Instrs Cycles Max live registers Max dispatch width
total_war_warhammer2 477 +0.25% +0.66% -0.17% +0.10%
PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Max live registers Max dispatch width
witcher_3_dxvk_g2 1074 +0.75% -10.45% -0.15% -0.16% -0.16%
PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Max live registers
wolfenstein_youngblood 1111 +0.52% +0.66% -0.59% -0.03%
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477 >
2023-06-14 12:04:05 +00:00
Lionel Landwerlin
4a23a5a904
nir: add a new ubo uniform loading intrinsic for intel
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477 >
2023-06-14 12:04:05 +00:00
Lionel Landwerlin
7eb1e2a690
intel/fs: avoid reusing the VGRF for uniform load_ubo
...
Only found 3 shaders affected in Red Dead Redemption :
Totals from 3 (0.05% of 5969) affected shaders:
Instrs: 2246 -> 2230 (-0.71%)
Cycles: 156506 -> 148402 (-5.18%); split: -5.23%, +0.05%
This will have a larger effect when we add the
load_ubo_uniform_block_intel intrinsic where we will have larger
blocks (vec8/vec16 vs vec4 only now).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477 >
2023-06-14 12:04:05 +00:00
Lionel Landwerlin
ff3494fce3
intel/fs: print identation for control flow
...
INTEL_DEBUG=optimizer output changes from :
{ 10} 40: cmp.nz.f0.0(8) null:F, vgrf3470:F, 0f
{ 10} 41: (+f0.0) if(8) (null):UD,
{ 11} 42: txf_logical(8) vgrf3473:UD, vgrf250:D(null):UD, 0d(null):UD(null):UD(null):UD(null):UD, 31u, 0u(null):UD(null):UD(null):UD, 3d, 0d
{ 12} 43: and(8) vgrf262:UD, vgrf3473:UD, 2u
{ 11} 44: cmp.nz.f0.0(8) null:D, vgrf262:D, 0d
{ 10} 45: (+f0.0) if(8) (null):UD,
{ 11} 46: mov(8) vgrf270:D, -1082130432d
{ 12} 47: mov(8) vgrf271:D, 1082130432d
{ 14} 48: mov(8) vgrf274+0.0:D, 0d
{ 14} 49: mov(8) vgrf274+1.0:D, 0d
to :
{ 10} 40: cmp.nz.f0.0(8) null:F, vgrf3470:F, 0f
{ 10} 41: (+f0.0) if(8) (null):UD,
{ 11} 42: txf_logical(8) vgrf3473:UD, vgrf250:D(null):UD, 0d(null):UD(null):UD(null):UD(null):UD, 31u, 0u(null):UD(null):UD(null):UD, 3d, 0d
{ 12} 43: and(8) vgrf262:UD, vgrf3473:UD, 2u
{ 11} 44: cmp.nz.f0.0(8) null:D, vgrf262:D, 0d
{ 10} 45: (+f0.0) if(8) (null):UD,
{ 11} 46: mov(8) vgrf270:D, -1082130432d
{ 12} 47: mov(8) vgrf271:D, 1082130432d
{ 14} 48: mov(8) vgrf274+0.0:D, 0d
{ 14} 49: mov(8) vgrf274+1.0:D, 0d
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477 >
2023-06-14 12:04:05 +00:00
Karol Herbst
5b3ff7e3f3
rusticl/queue: overhaul of the queue+event handling
...
This new approach handles things as follows:
1. Fences won't be attached to events anymore, applications only wait on
the cv attached to the event.
2. Only the queue is allowed to update event status for non user events.
This will eliminate all remaining status updating races between the
queue and applications waiting on events.
3. Queue minimized flushing by bundling events
4. Increase cv wait timeout as there is really no point in waking up too
often.
Reduces amount of emited fences on radeonsi in luxmark 3.1 luxball by 90%
Signed-off-by: Karol Herbst <git@karolherbst.de>
Reviewed by Nora Allen <blackcatgames@protonmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23612 >
2023-06-14 11:14:46 +00:00
Iago Toral Quiroga
6114e66124
broadcom/compiler: only use last thread switch flag to detect final section
...
Since commit 'c98ddc778a3 broadcom/compiler: force a last thrsw for spilling'
we always ensure we signal the last thread section explicitly with a
last thread switch.
Relying on VPM stores to detect the last thread section is particularly bad,
because we can have VPM stores occurring quite early in a shader program,
which would disable TMU spilling almost entirely.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22461 >
2023-06-14 09:27:50 +00:00
Alejandro Piñeiro
dfdbf5bf94
broadcom/compiler: clarify use of QFILE_VPM
...
This was only used for version < 40 (See commit 22a02f3e3 ).
Adding some extra explanations and asserts of places where it is used.
As we are here also move the definition of a register with QFILE_VPM,
to avoid defining it if not needed.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22984 >
2023-06-14 09:03:35 +00:00
Lionel Landwerlin
0cd9f0c3d3
intel/fs: fix bindless/shared surface mistake
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 068bf1378d ("intel/fs: enable SSBO accesses through the bindless heap")
Tested-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23536 >
2023-06-14 07:42:57 +00:00
Lionel Landwerlin
b3b12c2c27
anv: enable CmdCopyQueryPoolResults to use shader for copies
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23074 >
2023-06-14 09:43:57 +03:00
Lionel Landwerlin
e86f3c7abb
intel/ds: add query count in query tracepoints
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23074 >
2023-06-14 09:43:57 +03:00
Lionel Landwerlin
930e862af7
anv: add shaders for copying query results
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23074 >
2023-06-14 09:43:57 +03:00