Lionel Landwerlin
0e728ea7b0
anv: avoid private buffer allocations in vkGetDeviceImageMemoryRequirementsKHR
...
The whole point of vkGetDeviceImageMemoryRequirementsKHR is to avoid
creating an image so we should completely avoid any allocation like
the private binding.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 4075dd16ab ("anv: implement vkGetDeviceImageMemoryRequirementsKHR")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23720 >
2023-06-19 18:35:30 +03:00
Marek Olšák
da4b5b4a47
intel/ci: disable iris-jsl-deqp because it always fails for an AMD MR
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23687 >
2023-06-17 23:42:21 +00:00
Iván Briano
6e5eb0afd3
anv: do not explode on 32 bit builds
...
Fixes: 930e862af7 ("anv: add shaders for copying query results")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9213
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23686 >
2023-06-16 22:22:07 +00:00
Jianxun Zhang
f71c42bc2c
intel/isl: Add MTL MC CCS modifier into modifier info
...
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20327 >
2023-06-16 17:32:10 +00:00
Jianxun Zhang
a45f5500dd
intel/isl: Add MTL RC CCS CC modifier into modifier info
...
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20327 >
2023-06-16 17:32:10 +00:00
Jianxun Zhang
898c7252c1
intel/isl: Add MTL RC CCS modifier into modifier info
...
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20327 >
2023-06-16 17:32:10 +00:00
Tapani Pälli
8f2680871d
anv: convert most pc in genX_cmd_buffer to use pc helper
...
Some are left, batch_set_preemption does not have devinfo pointer
and IndirectStatePointersDisable does not have corresponding ANV bit.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23583 >
2023-06-16 08:04:20 +00:00
Tapani Pälli
9f3b51255a
anv: change most pipe controls in gfx8_cmd_buffer to use pc helper
...
One using a flag (PSDSyncEnable) that has no corresponding ANV bit.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23583 >
2023-06-16 08:04:20 +00:00
Tapani Pälli
b3589a8899
anv: change pipe control in indirect draw gen to use pc helper
...
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23583 >
2023-06-16 08:04:20 +00:00
Tapani Pälli
6a7dcd3e12
anv: change pipe controls in genX_gpu_memcpy to use pc helper
...
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23583 >
2023-06-16 08:04:20 +00:00
Tapani Pälli
e70cf3ea98
anv: change pipe control in genX_pipeline to use pc helper
...
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23583 >
2023-06-16 08:04:20 +00:00
Tapani Pälli
c232269db4
anv: change pipe controls in genX_state to use pc helper
...
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23583 >
2023-06-16 08:04:20 +00:00
Tapani Pälli
6dc95685f3
anv: convert genX_query pipe controls to use pc helper
...
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23583 >
2023-06-16 08:04:20 +00:00
Tapani Pälli
d8c76f8844
anv: implement invalidate part of emit_apply_pipe_flushes with helper
...
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23583 >
2023-06-16 08:04:20 +00:00
Tapani Pälli
9f6f69e0f9
anv: implement flush part of emit_apply_pipe_flushes with helper
...
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23583 >
2023-06-16 08:04:20 +00:00
Tapani Pälli
c3658f5a5d
anv: wrap pipe control emission to a set of helper functions
...
This makes it possible to have HW specific rules and WA's implemented
in a central place. Also all pipe controls will get anv_debug_dump_pc.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23583 >
2023-06-16 08:04:20 +00:00
Yiwei Zhang
c8d8961f33
anv: avoid requiring ordered memory planes for explicit import
...
The spec does not have such requirement, but anv requires it for
validating the offset. However, for DRM_FORMAT_YVU420, chroma channels
can be swapped upon import to match B/R channel order of
VK_FORMAT_G8_B8_R8_3PLANE_420_UNORM.
This fixes some sw codec path in Instagram when interop with gpu.
v2: fix image memory requirement for re-ordered explicit import
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Emma Anholt <emma@anholt.net> (v1)
Reviewed-by: Matt Tuner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23643 >
2023-06-15 17:53:10 +00:00
Nanley Chery
f0b6b57c77
intel/blorp: Avoid 32bpc fast clear sampling issue
...
For 32bpc formats, the ICL+ sampler fetches the raw clear color dwords
used for rendering instead of the converted pixel dwords typically used
for sampling. The CLEAR_COLOR struct page documents this for 128bpp
formats, but not for 32bpp and 64bpp formats.
In blorp_copy, map R11G11B10_FLOAT to R8G8B8A8_UINT instead of R32_UINT.
This will cause the sampler to fetch the clear color pixel, allowing
drivers to keep clear color support enabled during copies.
If iris is forced to convert blits to copies, this patch fixes the
following test on gfx12:
dEQP-GLES3.functional.fbo.color.repeated_clear.blit.rbo.r11f_g11f_b10f
At the moment, both iris and anv won't hit this issue outside of
blorp_copy. This is due to the read/write access restrictions they
currently place on texture views that reinterpret the surface format.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8964
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23604 >
2023-06-15 14:17:49 +00:00
Erik Faye-Lund
a593de7cf3
nir: add missed nir_cmp_imm-helpers
...
Seems I missed these in my previous round, let's fix them up now!
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23461 >
2023-06-15 13:34:49 +00:00
Erik Faye-Lund
2a71e332aa
nir: use new immediate comparison helpers
...
There's plenty of places we can use these new and shiny helpers, so
let's clean up the code a bit.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23460 >
2023-06-15 13:33:58 +02:00
Ian Romanick
96cde9cc01
intel/fs: Emit better code for bfi(..., 0)
...
DG2, Tiger Lake, Ice Lake, and Skylake had similar results (Ice Lake shown)
total instructions in shared programs: 20570141 -> 20570063 (<.01%)
instructions in affected programs: 30679 -> 30601 (-0.25%)
helped: 77 / HURT: 0
total cycles in shared programs: 902113977 -> 902118723 (<.01%)
cycles in affected programs: 3255958 -> 3260704 (0.15%)
helped: 60 / HURT: 19
Broadwell
total instructions in shared programs: 18524633 -> 18524547 (<.01%)
instructions in affected programs: 34095 -> 34009 (-0.25%)
helped: 75 / HURT: 2
total cycles in shared programs: 949532394 -> 949543761 (<.01%)
cycles in affected programs: 3419107 -> 3430474 (0.33%)
helped: 57 / HURT: 24
total spills in shared programs: 22484 -> 22484 (0.00%)
spills in affected programs: 516 -> 516 (0.00%)
helped: 2 / HURT: 2
total fills in shared programs: 29346 -> 29338 (-0.03%)
fills in affected programs: 572 -> 564 (-1.40%)
helped: 4 / HURT: 0
Haswell
total instructions in shared programs: 17331356 -> 17331523 (<.01%)
instructions in affected programs: 27920 -> 28087 (0.60%)
helped: 41 / HURT: 4
total cycles in shared programs: 936603192 -> 936574664 (<.01%)
cycles in affected programs: 3417695 -> 3389167 (-0.83%)
helped: 28 / HURT: 21
total spills in shared programs: 19718 -> 19756 (0.19%)
spills in affected programs: 436 -> 474 (8.72%)
helped: 0 / HURT: 4
total fills in shared programs: 22547 -> 22607 (0.27%)
fills in affected programs: 444 -> 504 (13.51%)
helped: 0 / HURT: 4
Ivy Bridge
total cycles in shared programs: 463451277 -> 463451273 (<.01%)
cycles in affected programs: 95870 -> 95866 (<.01%)
helped: 3 / HURT: 2
DG2, Tiger Lake, Ice Lake, and Skylake had similar results (Ice Lake shown)
Totals:
Instrs: 152825278 -> 152819969 (-0.00%); split: -0.00%, +0.00%
Cycles: 15014075626 -> 15014628652 (+0.00%); split: -0.01%, +0.01%
Subgroup size: 8528536 -> 8528560 (+0.00%)
Send messages: 7711431 -> 7711464 (+0.00%)
Spill count: 99907 -> 99509 (-0.40%); split: -0.40%, +0.00%
Fill count: 202459 -> 201598 (-0.43%); split: -0.43%, +0.00%
Scratch Memory Size: 4376576 -> 4371456 (-0.12%)
Totals from 2915 (0.44% of 662497) affected shaders:
Instrs: 2288842 -> 2283533 (-0.23%); split: -0.24%, +0.01%
Cycles: 471633295 -> 472186321 (+0.12%); split: -0.27%, +0.39%
Subgroup size: 27488 -> 27512 (+0.09%)
Send messages: 151344 -> 151377 (+0.02%)
Spill count: 48091 -> 47693 (-0.83%); split: -0.83%, +0.00%
Fill count: 59053 -> 58192 (-1.46%); split: -1.46%, +0.00%
Scratch Memory Size: 1827840 -> 1822720 (-0.28%)
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19968 >
2023-06-14 18:49:53 +00:00
Ian Romanick
e419eefd34
intel/fs: Use nir_opt_reassociate_bfi
...
All Skylake and newer Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 19907072 -> 19907054 (<.01%)
instructions in affected programs: 8859 -> 8841 (-0.20%)
helped: 9 / HURT: 0
total cycles in shared programs: 855791238 -> 855779334 (<.01%)
cycles in affected programs: 3308294 -> 3296390 (-0.36%)
helped: 12 / HURT: 13
Broadwell
total instructions in shared programs: 17818231 -> 17817440 (<.01%)
instructions in affected programs: 9887 -> 9096 (-8.00%)
helped: 9 / HURT: 0
total cycles in shared programs: 902970035 -> 902941221 (<.01%)
cycles in affected programs: 2767243 -> 2738429 (-1.04%)
helped: 14 / HURT: 5
total spills in shared programs: 17784 -> 17718 (-0.37%)
spills in affected programs: 318 -> 252 (-20.75%)
helped: 1 / HURT: 0
total fills in shared programs: 25458 -> 24949 (-2.00%)
fills in affected programs: 1346 -> 837 (-37.82%)
helped: 1 / HURT: 0
Haswell
total instructions in shared programs: 16707799 -> 16707586 (<.01%)
instructions in affected programs: 24049 -> 23836 (-0.89%)
helped: 41 / HURT: 0
total cycles in shared programs: 882730648 -> 882723174 (<.01%)
cycles in affected programs: 5096737 -> 5089263 (-0.15%)
helped: 25 / HURT: 12
total spills in shared programs: 14937 -> 14909 (-0.19%)
spills in affected programs: 436 -> 408 (-6.42%)
helped: 4 / HURT: 0
total fills in shared programs: 17569 -> 17529 (-0.23%)
fills in affected programs: 444 -> 404 (-9.01%)
helped: 4 / HURT: 0
No shader-db changes on any older Intel platforms.
All Intel platforms had similar results. (Ice Lake shown)
Totals:
Instrs: 153118594 -> 153117340 (-0.00%); split: -0.00%, +0.00%
Cycles: 15011967556 -> 15011904351 (-0.00%); split: -0.00%, +0.00%
Fill count: 203692 -> 203684 (-0.00%)
Totals from 703 (0.11% of 662496) affected shaders:
Instrs: 192826 -> 191572 (-0.65%); split: -0.65%, +0.00%
Cycles: 29937640 -> 29874435 (-0.21%); split: -0.25%, +0.04%
Fill count: 4146 -> 4138 (-0.19%)
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19968 >
2023-06-14 18:49:53 +00:00
Emma Anholt
5ef4e1c4c0
ci: Drop some skips of GL CTS ArraysOfArrays tests.
...
My hope is that with my CTS fix, we can complete these all in time now.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23610 >
2023-06-14 16:45:23 +00:00
Emma Anholt
8c35537351
ci: Update to vulkan-cts-1.3.5.2 (and pull in some more fixes).
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23610 >
2023-06-14 16:45:23 +00:00
Emma Anholt
10b94772d2
intel: Reduce cost of resetting last_grf_write.
...
In zink-on-anv fs-mod-dvec3-dvec3.shader_test, we were memsetting 2MB of
last_grf_write 2400 times, multiple times through the scheduler. Just
resetting for the processed instructions reduces runtime from 21s to 16s.
No change on steam shader-db runtime across several runs.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23635 >
2023-06-14 16:16:56 +00:00
Emma Anholt
7d4769e802
intel: Allocate the last_grf_write once per scheduler.
...
No need to re-calloc it per block when we're going to use it again. Also,
this fixes the vec4 backend to avoid allocating giant grf_count-sized
arrays on the stack.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23635 >
2023-06-14 16:16:56 +00:00
Emma Anholt
2ad865b219
intel: Count reads_remaining across all blocks.
...
We were zeroing it out per block, but it doesn't actually help to count
per block, since the question is "will scheduling this instruction free
the reg?". Saves some memsetting, which was showing up high in the
profile (but not from this source).
No change on iris SKL shader-db.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23635 >
2023-06-14 16:16:55 +00:00
Lionel Landwerlin
6b9f838d62
intel/fs: handle load_global_constant_uniform_block_intel
...
Again, load the data just once in GRF, share it across lanes.
Shader-db on dg2:
total instructions in shared programs: 23214555 -> 23215400 (<.01%)
instructions in affected programs: 199977 -> 200822 (0.42%)
helped: 3
HURT: 38
helped stats (abs) min: 5 max: 670 x̄: 283.67 x̃: 176
helped stats (rel) min: 1.34% max: 49.41% x̄: 22.15% x̃: 15.70%
HURT stats (abs) min: 1 max: 185 x̄: 44.63 x̃: 32
HURT stats (rel) min: 0.13% max: 42.86% x̄: 10.25% x̃: 9.30%
95% mean confidence interval for instructions value: -18.65 59.87
95% mean confidence interval for instructions %-change: 3.29% 12.47%
Inconclusive result (value mean confidence interval includes 0).
total loops in shared programs: 5928 -> 5928 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0
total cycles in shared programs: 851137495 -> 851152449 (<.01%)
cycles in affected programs: 16406137 -> 16421091 (0.09%)
helped: 9
HURT: 32
helped stats (abs) min: 10 max: 13498 x̄: 6443.22 x̃: 5581
helped stats (rel) min: 0.11% max: 4.75% x̄: 1.45% x̃: 0.34%
HURT stats (abs) min: 3 max: 15056 x̄: 2279.47 x̃: 735
HURT stats (rel) min: 0.10% max: 23.71% x̄: 4.58% x̃: 4.65%
95% mean confidence interval for cycles value: -1315.40 2044.87
95% mean confidence interval for cycles %-change: 1.71% 4.80%
Inconclusive result (value mean confidence interval includes 0).
total spills in shared programs: 11856 -> 11825 (-0.26%)
spills in affected programs: 2368 -> 2337 (-1.31%)
helped: 4
HURT: 0
total fills in shared programs: 16258 -> 16207 (-0.31%)
fills in affected programs: 2930 -> 2879 (-1.74%)
helped: 4
HURT: 0
total sends in shared programs: 1038194 -> 1038185 (<.01%)
sends in affected programs: 40 -> 31 (-22.50%)
helped: 4
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 2.25 x̃: 2
helped stats (rel) min: 10.00% max: 33.33% x̄: 21.46% x̃: 21.25%
95% mean confidence interval for sends value: -4.64 0.14
95% mean confidence interval for sends %-change: -40.41% -2.51%
Inconclusive result (value mean confidence interval includes 0).
LOST: 0
GAINED: 0
Some VK/DX titles result (on DG2 only), it's mostly additional
instruction counts except for the unity spaceship demo where a CS
shader gets additional SIMDness. The reason for additional
instructions is that since we're doing block loads, we need to find
the live channels in control flow to select a single lane value that
is valid.
aztec_ruins_high:
Totals from 3 (1.12% of 269) affected shaders:
Instrs: 17732 -> 17896 (+0.92%)
Cycles: 796518 -> 819302 (+2.86%)
cyberpunk_2077:
Totals from 17 (0.17% of 10301) affected shaders:
Instrs: 10848 -> 11658 (+7.47%)
Cycles: 248243 -> 259168 (+4.40%); split: -0.57%, +4.97%
fallout_4_dxvk_g2:
Totals from 2 (0.12% of 1638) affected shaders:
Instrs: 3157 -> 3368 (+6.68%)
Cycles: 487807 -> 490426 (+0.54%); split: -0.26%, +0.79%
Max live registers: 139 -> 141 (+1.44%)
red_dead_redemption2:
Totals from 68 (1.14% of 5970) affected shaders:
Instrs: 34871 -> 36486 (+4.63%)
Cycles: 551430 -> 565211 (+2.50%)
Send messages: 2074 -> 2072 (-0.10%)
Max live registers: 5078 -> 5077 (-0.02%)
total_war_warhammer2:
Totals from 5 (1.05% of 478) affected shaders:
Instrs: 6905 -> 6971 (+0.96%); split: -0.16%, +1.12%
Cycles: 97035 -> 97989 (+0.98%); split: -0.07%, +1.05%
unity spaceship demo (instruction count going up due to a CS shader
bump from SIMD8->16):
Totals from 53 (9.71% of 546) affected shaders:
Instrs: 223748 -> 233223 (+4.23%); split: -0.01%, +4.25%
Cycles: 23134697 -> 25207080 (+8.96%); split: -0.17%, +9.13%
Subgroup size: 480 -> 488 (+1.67%)
Spill count: 2156 -> 2242 (+3.99%); split: -0.19%, +4.17%
Fill count: 4617 -> 4845 (+4.94%); split: -0.09%, +5.02%
Max live registers: 5991 -> 6050 (+0.98%); split: -0.40%, +1.39%
Max dispatch width: 480 -> 488 (+1.67%)
witcher_3_dxvk_g2:
Totals from 27 (2.51% of 1074) affected shaders:
Instrs: 57067 -> 57677 (+1.07%); split: -0.03%, +1.10%
Cycles: 1397871 -> 1436704 (+2.78%); split: -0.35%, +3.13%
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477 >
2023-06-14 12:04:05 +00:00
Lionel Landwerlin
5ae8a78d8c
intel/fs: make use of load_ubo_uniform_block_intel
...
The principle is the same as the load_ssbo_uniform_block_intel.
Whenever we see a uniform offset, load the data only once in GRFs to
reduce register pressure.
Iris shader-db run on DG2 :
total instructions in shared programs: 23001325 -> 23094969 (0.41%)
instructions in affected programs: 1775989 -> 1869633 (5.27%)
helped: 764
HURT: 2097
helped stats (abs) min: 1 max: 102 x̄: 6.96 x̃: 2
helped stats (rel) min: 0.03% max: 16.91% x̄: 1.36% x̃: 0.63%
HURT stats (abs) min: 1 max: 2461 x̄: 47.19 x̃: 7
HURT stats (rel) min: <.01% max: 199.34% x̄: 5.91% x̃: 2.60%
95% mean confidence interval for instructions value: 25.43 40.03
95% mean confidence interval for instructions %-change: 3.60% 4.33%
Instructions are HURT.
total loops in shared programs: 5847 -> 5847 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0
total cycles in shared programs: 839329852 -> 845491482 (0.73%)
cycles in affected programs: 130229434 -> 136391064 (4.73%)
helped: 1098
HURT: 2228
helped stats (abs) min: 1 max: 130102 x̄: 1340.64 x̃: 22
helped stats (rel) min: <.01% max: 64.25% x̄: 4.03% x̃: 0.71%
HURT stats (abs) min: 1 max: 185309 x̄: 3426.24 x̃: 87
HURT stats (rel) min: <.01% max: 92.85% x̄: 8.12% x̃: 3.82%
95% mean confidence interval for cycles value: 1342.16 2362.97
95% mean confidence interval for cycles %-change: 3.70% 4.52%
Cycles are HURT.
total spills in shared programs: 10768 -> 11856 (10.10%)
spills in affected programs: 9717 -> 10805 (11.20%)
helped: 25
HURT: 28
total fills in shared programs: 13720 -> 16258 (18.50%)
fills in affected programs: 12016 -> 14554 (21.12%)
helped: 25
HURT: 28
total sends in shared programs: 1034790 -> 1031266 (-0.34%)
sends in affected programs: 33416 -> 29892 (-10.55%)
helped: 1005
HURT: 0
helped stats (abs) min: 1 max: 22 x̄: 3.51 x̃: 3
helped stats (rel) min: 1.69% max: 60.00% x̄: 15.20% x̃: 14.08%
95% mean confidence interval for sends value: -3.72 -3.29
95% mean confidence interval for sends %-change: -15.82% -14.57%
Sends are helped.
LOST: 26
GAINED: 183
shader-db on a number of VK/DX titles on DG2 :
PERCENTAGE DELTAS Shaders Instrs Cycles
age_of_wonders_III 1928 +0.02% -0.19%
PERCENTAGE DELTAS Shaders Instrs Cycles Subgroup size Send messages Spill count Fill count Max live registers Max dispatch width
assassins_creed_odyssey 2119 +1.12% -0.42% -0.03% -0.29% -9.10% -4.26% -0.64% +0.65%
PERCENTAGE DELTAS Shaders Instrs Cycles Spill count Fill count Max live registers
aztec_ruins_high 269 -0.05% -0.45% -0.29% -7.27% -0.33%
PERCENTAGE DELTAS Shaders Instrs Cycles Max live registers Max dispatch width
dark_souls_3_dxvk_g2 1420 +0.09% +0.24% +0.21% +0.12%
(stats look bad, but it's just one shader affected)
PERCENTAGE DELTAS Shaders Instrs Cycles Spill count Fill count Scratch Memory Size Max live registers
fallout_4_dxvk_g2 1638 +0.67% +8.32% +16.02% +7.17% +100.00% +0.48%
PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Spill count Fill count Max live registers Max dispatch width
red_dead_redemption2 5969 +0.16% -0.04% -0.04% +0.01% +0.05% -0.20% +0.04%
PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Max live registers Max dispatch width
rise_of_the_tomb_raider_g2 12129 +2.19% +1.36% -1.23% -0.36% +2.04%
PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Max live registers
shooter-game 693 +0.07% -0.89% -0.09% -0.09%
PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Max live registers Max dispatch width
talos_g2 1140 +0.37% +3.80% -0.86% -0.67% +0.19%
PERCENTAGE DELTAS Shaders Instrs Cycles Max live registers Max dispatch width
total_war_warhammer2 477 +0.25% +0.66% -0.17% +0.10%
PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Max live registers Max dispatch width
witcher_3_dxvk_g2 1074 +0.75% -10.45% -0.15% -0.16% -0.16%
PERCENTAGE DELTAS Shaders Instrs Cycles Send messages Max live registers
wolfenstein_youngblood 1111 +0.52% +0.66% -0.59% -0.03%
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477 >
2023-06-14 12:04:05 +00:00
Lionel Landwerlin
7eb1e2a690
intel/fs: avoid reusing the VGRF for uniform load_ubo
...
Only found 3 shaders affected in Red Dead Redemption :
Totals from 3 (0.05% of 5969) affected shaders:
Instrs: 2246 -> 2230 (-0.71%)
Cycles: 156506 -> 148402 (-5.18%); split: -5.23%, +0.05%
This will have a larger effect when we add the
load_ubo_uniform_block_intel intrinsic where we will have larger
blocks (vec8/vec16 vs vec4 only now).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477 >
2023-06-14 12:04:05 +00:00
Lionel Landwerlin
ff3494fce3
intel/fs: print identation for control flow
...
INTEL_DEBUG=optimizer output changes from :
{ 10} 40: cmp.nz.f0.0(8) null:F, vgrf3470:F, 0f
{ 10} 41: (+f0.0) if(8) (null):UD,
{ 11} 42: txf_logical(8) vgrf3473:UD, vgrf250:D(null):UD, 0d(null):UD(null):UD(null):UD(null):UD, 31u, 0u(null):UD(null):UD(null):UD, 3d, 0d
{ 12} 43: and(8) vgrf262:UD, vgrf3473:UD, 2u
{ 11} 44: cmp.nz.f0.0(8) null:D, vgrf262:D, 0d
{ 10} 45: (+f0.0) if(8) (null):UD,
{ 11} 46: mov(8) vgrf270:D, -1082130432d
{ 12} 47: mov(8) vgrf271:D, 1082130432d
{ 14} 48: mov(8) vgrf274+0.0:D, 0d
{ 14} 49: mov(8) vgrf274+1.0:D, 0d
to :
{ 10} 40: cmp.nz.f0.0(8) null:F, vgrf3470:F, 0f
{ 10} 41: (+f0.0) if(8) (null):UD,
{ 11} 42: txf_logical(8) vgrf3473:UD, vgrf250:D(null):UD, 0d(null):UD(null):UD(null):UD(null):UD, 31u, 0u(null):UD(null):UD(null):UD, 3d, 0d
{ 12} 43: and(8) vgrf262:UD, vgrf3473:UD, 2u
{ 11} 44: cmp.nz.f0.0(8) null:D, vgrf262:D, 0d
{ 10} 45: (+f0.0) if(8) (null):UD,
{ 11} 46: mov(8) vgrf270:D, -1082130432d
{ 12} 47: mov(8) vgrf271:D, 1082130432d
{ 14} 48: mov(8) vgrf274+0.0:D, 0d
{ 14} 49: mov(8) vgrf274+1.0:D, 0d
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477 >
2023-06-14 12:04:05 +00:00
Lionel Landwerlin
0cd9f0c3d3
intel/fs: fix bindless/shared surface mistake
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 068bf1378d ("intel/fs: enable SSBO accesses through the bindless heap")
Tested-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23536 >
2023-06-14 07:42:57 +00:00
Lionel Landwerlin
b3b12c2c27
anv: enable CmdCopyQueryPoolResults to use shader for copies
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23074 >
2023-06-14 09:43:57 +03:00
Lionel Landwerlin
e86f3c7abb
intel/ds: add query count in query tracepoints
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23074 >
2023-06-14 09:43:57 +03:00
Lionel Landwerlin
930e862af7
anv: add shaders for copying query results
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23074 >
2023-06-14 09:43:57 +03:00
Lionel Landwerlin
4cee8ce7a5
anv: generalize internal kernel concept
...
We'll add more of those kernels for other purposes.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23074 >
2023-06-14 09:43:57 +03:00
Lionel Landwerlin
7ca5c84804
anv: add support for simple internal compute shaders
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23074 >
2023-06-14 09:43:57 +03:00
Lionel Landwerlin
dbbcd5c32c
anv: factor out generation kernel dispatch into helper
...
We would like to reuse this mechanism to dispatch different types of
internal shader. Those would replace some of the command streamer
commands we currently use.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23074 >
2023-06-14 09:43:57 +03:00
Lionel Landwerlin
455a13fb7f
anv: limit ANV_PIPE_RENDER_TARGET_BUFFER_WRITES to blorp operations using 3D
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23074 >
2023-06-14 09:43:56 +03:00
Lionel Landwerlin
d7c28e526b
anv: fix incorrect batch for 3DSTATE_CONSTANT_ALL emission
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c950fe97a0 ("anv: implement generated (indexed) indirect draws")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23074 >
2023-06-14 09:43:56 +03:00
Lionel Landwerlin
0da39bf8ee
anv: disable mesh/task for generated draws
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c950fe97a0 ("anv: implement generated (indexed) indirect draws")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23074 >
2023-06-14 09:43:56 +03:00
Lionel Landwerlin
e9c1eaa535
anv: only disable mesh when enabled at the VkDevice level
...
Saving ourselves some instructions since it's not going to get used.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23074 >
2023-06-14 09:43:56 +03:00
Lionel Landwerlin
efd4a162d3
anv: always report all pipeline stats regardless of stages
...
Tools like the scripts in shader-db expect all the fields to be there,
as the stats are put into a CSV file. So just report 0 if a stage
doesn't support workgroup memory size.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23559 >
2023-06-13 23:26:40 +00:00
Lionel Landwerlin
810da51e91
anv: report max simd width only once for fragment shaders
...
Reporting the value multiple times is confusing to shader-db scripts
because it believes multiple shaders are affected.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23559 >
2023-06-13 23:26:40 +00:00
Lionel Landwerlin
a0a20164eb
anv: deal with unsupported VkImageFormatListCreateInfo::pViewFormats
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 697ed61e7c ("anv: Improve image/view usage bits verification")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9190
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23606 >
2023-06-13 20:16:46 +00:00
Alyssa Rosenzweig
1d4a59448c
treewide: Remove use_scoped_barrier
...
It is now set by all relevant drivers and not checked anywhere.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23191 >
2023-06-13 16:36:10 +00:00
Tapani Pälli
00a91d8870
anv: use workaround framework for 1408224581, 14014097488
...
This makes sure we apply WA only when it is required, these issues
do not happen for later MTL steppings.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23596 >
2023-06-13 13:27:30 +00:00
Tapani Pälli
15433897b2
intel/dev: add parentheses around intel_needs_workaround macro
...
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23596 >
2023-06-13 13:27:30 +00:00
Jesse Natalie
082eba6165
nir_lower_mem_access_bit_sizes: Move options into a struct
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173 >
2023-06-13 00:43:36 +00:00
Jesse Natalie
4217353e2d
nir_lower_mem_access_bit_sizes: Add a bit_size input to the callback
...
We'd like to use this callback to adjust loads and stores from things
that are unsupported to things that are supported, but if the input
is already supported, we'd prefer not to change it. Rather than making
up a bit size that'd work and doing a bunch of pack/unpack bit math,
only return a different bit size if the input one doesn't work for us
(i.e. can't load enough memory or just an unsupported size entirely).
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173 >
2023-06-13 00:43:36 +00:00