fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-17 20:28:05 +02:00

Author	SHA1	Message	Date
Karol Herbst	202fe3de31	intel/compiler: drop 64 bit handling for cl workgroup intrinsics Signed-off-by: Karol Herbst <git@karolherbst.de> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24905>	2023-08-30 07:04:33 +00:00
Lionel Landwerlin	74a40cc4b6	intel/fs: move lower of non-uniform at_sample barycentric to NIR We use a non-uniform lowering loop in the backend which we can do better in NIR because we can also use divergence analysis there. This change also limits VGRF usage to a single VGRF to hold the sample ID in the backend. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24716>	2023-08-29 23:19:13 +00:00
Lionel Landwerlin	68027bd38e	intel/fs: implement dynamic interpolation mode for dynamic persample shaders There is no restriction on querying per-sample positions from the interpolator when in non-per-sample dispatch mode. But apparently that's not giving us the expected values for fragment shaders compiled without per-sample dispatch knowledge (graphics pipeline libraries). So when per-sample dispatch is dynamic and we're doing at_sample interpolation, turn the interpolation back into at_offset at runtime when we detect that the fragment shader is not run per sample. Fixes a bunch of dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.* Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `d8dfd153c5` ("intel/fs: Make per-sample and coarse dispatch tri-state") Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24716>	2023-08-29 23:19:13 +00:00
Lionel Landwerlin	9bf2a89127	intel/compiler: fix dynamic alpha-to-coverage handling Got the wrong logic operation. Let's reuse the nicer NIR builder helper. Fixes a bunch of KHR-GL46.sample_variables.mask.rgba8..samples.mask* Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `fd7debc8bb` ("intel/fs: make alpha_to_coverage a tristate") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9568 Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24716>	2023-08-29 23:19:12 +00:00
Lionel Landwerlin	d74c301026	intel/compiler: disable per-sample interpolation modes with non-per-sample dispatch Fixes hangs in dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.* Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `5644011f06` ("intel/compiler: Convert wm_prog_key::persample_interp to a tri-state") Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24716>	2023-08-29 23:19:12 +00:00
Ian Romanick	927a24db14	intel/fs: New VGRF packing scheme for constant combining Each block is processed separately. VGRF channels that are allocated to values that are only used in a particular block are made available in other blocks. This is almost always an improvement, but there are some pessimal cases where it goes horribly wrong. Imagine a shader with two blocks. In that shader, the first block has 5 constants used in the first block and the second block. Three other constants are only used in the first block. The second block has 15 constants that are used only in the block. The static VGRF usage is 3 regardless of packing. However, scheduling may be able to shorten the live range of the first VGRF when it only has values that came from the first block (because three of the values are dead on entry to the second block). This used to occur in a Mad Max shader on Broadwell. That shader went from 0:0 spills:fills to 107:52. Some changes over the last year, I'm assuming !13734, have prevented this case from occurring. This change created a lot of churn on Haswell and Ivy Bridge. This seems to be primarily due to all the extra constants used for coissue, but I did not investigate very deeply. On older platforms, there were no changes to spills or fills. As a result, this is only used on Broadwell and newer platforms. v2: Update expected checksum for pixmark-piano-v2.trace on gl-zink-anv-tgl. See #9714 for more details. 
shader-db results: Tiger Lake total instructions in shared programs: 21101332 -> 21102084 (<.01%) instructions in affected programs: 863686 -> 864438 (0.09%) helped: 463 / HURT: 437 total cycles in shared programs: 790573225 -> 790664391 (0.01%) cycles in affected programs: 92546803 -> 92637969 (0.10%) helped: 558 / HURT: 629 total spills in shared programs: 3959 -> 3951 (-0.20%) spills in affected programs: 184 -> 176 (-4.35%) helped: 2 / HURT: 0 total fills in shared programs: 2639 -> 2631 (-0.30%) fills in affected programs: 184 -> 176 (-4.35%) helped: 2 / HURT: 0 LOST: 1 GAINED: 5 Ice Lake and Skylake had similar results. (Ice Lake shown) total instructions in shared programs: 19945216 -> 19944711 (<.01%) instructions in affected programs: 139569 -> 139064 (-0.36%) helped: 66 / HURT: 3 total cycles in shared programs: 858410082 -> 857381323 (-0.12%) cycles in affected programs: 383825958 -> 382797199 (-0.27%) helped: 1012 / HURT: 1055 total spills in shared programs: 6190 -> 6116 (-1.20%) spills in affected programs: 891 -> 817 (-8.31%) helped: 66 / HURT: 3 total fills in shared programs: 7382 -> 7238 (-1.95%) fills in affected programs: 1538 -> 1394 (-9.36%) helped: 66 / HURT: 3 LOST: 5 GAINED: 8 Broadwell total instructions in shared programs: 17820886 -> 17812515 (-0.05%) instructions in affected programs: 800512 -> 792141 (-1.05%) helped: 385 / HURT: 1 total cycles in shared programs: 904482935 -> 903102070 (-0.15%) cycles in affected programs: 422427015 -> 421046150 (-0.33%) helped: 1091 / HURT: 812 total spills in shared programs: 17908 -> 16576 (-7.44%) spills in affected programs: 9459 -> 8127 (-14.08%) helped: 386 / HURT: 0 total fills in shared programs: 25397 -> 22354 (-11.98%) fills in affected programs: 15504 -> 12461 (-19.63%) helped: 385 / HURT: 1 LOST: 2 GAINED: 2 No shader-db changes on Haswell or older platforms. 
fossil-db results: Tiger Lake Instructions in all programs: 156881463 -> 156890970 (+0.0%) Instructions helped: 9033 Instructions hurt: 10285 Cycles in all programs: 7532597466 -> 7529647924 (-0.0%) Cycles helped: 10548 Cycles hurt: 13667 Spills in all programs: 5490 -> 5110 (-6.9%) Spills helped: 100 Spills hurt: 3 Fills in all programs: 6123 -> 5752 (-6.1%) Fills helped: 100 Fills hurt: 3 Gained: 17 Lost: 47 Ice Lake Instructions in all programs: 141309644 -> 141309603 (-0.0%) Instructions helped: 9 Instructions hurt: 4 Cycles in all programs: 9095812690 -> 9097008049 (+0.0%) Cycles helped: 14288 Cycles hurt: 16381 Spills in all programs: 7418 -> 7404 (-0.2%) Spills helped: 9 Spills hurt: 4 Fills in all programs: 8326 -> 8321 (-0.1%) Fills helped: 9 Fills hurt: 4 Skylake Instructions in all programs: 131872347 -> 131870690 (-0.0%) Instructions helped: 111 Instructions hurt: 3 Cycles in all programs: 8800835649 -> 8802483884 (+0.0%) Cycles helped: 9415 Cycles hurt: 9678 Spills in all programs: 6917 -> 6476 (-6.4%) Spills helped: 111 Spills hurt: 3 Fills in all programs: 7584 -> 7354 (-3.0%) Fills helped: 111 Fills hurt: 3 Lost: 5 Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7698>	2023-08-29 19:01:37 +00:00
Ian Romanick	c506d7e511	intel/fs: Combine constants for integer instructions too v2: Remove type change for SHR with negation. This was a leftover from a previous attempt to deal with SHR and negation. Now all right-shifts with unsigned parameters are marked as not being able to have source modifiers. v3: Disallow negations on right shifts of unsigned sources by setting the no_negations flag in add_candidate_immediate. This eliminates the need to exclude SHR in can_do_source_mods. Tiger Lake total instructions in shared programs: 21102817 -> 21099443 (-0.02%) instructions in affected programs: 296796 -> 293422 (-1.14%) helped: 92 / HURT: 356 total cycles in shared programs: 790564691 -> 790393358 (-0.02%) cycles in affected programs: 36456886 -> 36285553 (-0.47%) helped: 171 / HURT: 286 total spills in shared programs: 3951 -> 3959 (0.20%) spills in affected programs: 176 -> 184 (4.55%) helped: 0 / HURT: 2 total fills in shared programs: 2631 -> 2639 (0.30%) fills in affected programs: 176 -> 184 (4.55%) helped: 0 / HURT: 2 LOST: 0 GAINED: 4 Ice Lake total instructions in shared programs: 19954204 -> 19949122 (-0.03%) instructions in affected programs: 40301 -> 35219 (-12.61%) helped: 23 / HURT: 2 total cycles in shared programs: 858377735 -> 858462082 (<.01%) cycles in affected programs: 75537286 -> 75621633 (0.11%) helped: 124 / HURT: 319 total spills in shared programs: 6255 -> 6190 (-1.04%) spills in affected programs: 392 -> 327 (-16.58%) helped: 1 / HURT: 2 total fills in shared programs: 7813 -> 7382 (-5.52%) fills in affected programs: 942 -> 511 (-45.75%) helped: 1 / HURT: 2 LOST: 0 GAINED: 3 Skylake total instructions in shared programs: 18049362 -> 18044440 (-0.03%) instructions in affected programs: 48317 -> 43395 (-10.19%) helped: 26 / HURT: 2 total cycles in shared programs: 844884806 -> 844915655 (<.01%) cycles in affected programs: 76137133 -> 76167982 (0.04%) helped: 171 / HURT: 293 total spills in shared programs: 6148 -> 6149 (0.02%) spills in 
affected programs: 595 -> 596 (0.17%) helped: 4 / HURT: 2 total fills in shared programs: 7484 -> 7067 (-5.57%) fills in affected programs: 1226 -> 809 (-34.01%) helped: 4 / HURT: 2 LOST: 0 GAINED: 8 Broadwell total instructions in shared programs: 17826844 -> 17821805 (-0.03%) instructions in affected programs: 60687 -> 55648 (-8.30%) helped: 28 / HURT: 8 total cycles in shared programs: 905332682 -> 904369499 (-0.11%) cycles in affected programs: 76743509 -> 75780326 (-1.26%) helped: 179 / HURT: 225 total spills in shared programs: 17922 -> 17908 (-0.08%) spills in affected programs: 2495 -> 2481 (-0.56%) helped: 6 / HURT: 8 total fills in shared programs: 26290 -> 25397 (-3.40%) fills in affected programs: 2606 -> 1713 (-34.27%) helped: 8 / HURT: 6 LOST: 1 GAINED: 1 Haswell total instructions in shared programs: 16678878 -> 16674444 (-0.03%) instructions in affected programs: 78458 -> 74024 (-5.65%) helped: 87 / HURT: 6 total cycles in shared programs: 880189381 -> 880301043 (0.01%) cycles in affected programs: 29956463 -> 30068125 (0.37%) helped: 169 / HURT: 163 total spills in shared programs: 14428 -> 14378 (-0.35%) spills in affected programs: 2384 -> 2334 (-2.10%) helped: 8 / HURT: 6 total fills in shared programs: 16975 -> 16881 (-0.55%) fills in affected programs: 1334 -> 1240 (-7.05%) helped: 10 / HURT: 4 Ivy Bridge total instructions in shared programs: 15706048 -> 15706035 (<.01%) instructions in affected programs: 9941 -> 9928 (-0.13%) helped: 13 / HURT: 0 total cycles in shared programs: 433618834 -> 433624637 (<.01%) cycles in affected programs: 12926714 -> 12932517 (0.04%) helped: 52 / HURT: 41 Sandy Bridge total cycles in shared programs: 741223552 -> 741223443 (<.01%) cycles in affected programs: 19814 -> 19705 (-0.55%) helped: 14 / HURT: 0 No changes on Iron Lake or GM45 fossil-db changes: Tiger Lake Instructions in all programs: 156858030 -> 156905532 (+0.0%) Instructions helped: 3915 Instructions hurt: 15411 Cycles in all programs: 7529667771 
-> 7532117340 (+0.0%) Cycles helped: 10260 Cycles hurt: 9990 Spills in all programs: 5610 -> 5457 (-2.7%) Spills helped: 18 Fills in all programs: 6274 -> 6091 (-2.9%) Fills helped: 18 Gained: 2 Lost: 16 Ice Lake Instructions in all programs: 141308082 -> 141303083 (-0.0%) Instructions helped: 574 Instructions hurt: 172 Cycles in all programs: 9091361325 -> 9094622766 (+0.0%) Cycles helped: 8764 Cycles hurt: 11702 Spills in all programs: 7531 -> 7385 (-1.9%) Spills helped: 19 Fills in all programs: 8462 -> 8294 (-2.0%) Fills helped: 19 Gained: 22 Lost: 15 Skylake Instructions in all programs: 131872162 -> 131867263 (-0.0%) Instructions helped: 566 Instructions hurt: 172 Cycles in all programs: 8795095440 -> 8799676943 (+0.1%) Cycles helped: 8333 Cycles hurt: 12182 Spills in all programs: 7006 -> 6884 (-1.7%) Spills helped: 13 Fills in all programs: 7696 -> 7552 (-1.9%) Fills helped: 13 Gained: 24 Lost: 1 Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7698>	2023-08-29 19:01:36 +00:00
Ian Romanick	64c251bb3a	intel/fs: Combine constants for SEL instructions too It is very common to have bcsel where the second and third sources are both constants. This results in a situation where we would want to emit a SEL with two constant sources, but that's not allowed. Previously, we would load both constants into registers, then let constant propagation copy the last constant into the SEL instruction. This results in the constant using an entire SIMD register instead of a single channel. Instead, copy propagate both sources, then let the combine-constants pass do its thing. In the worst case, this stores the constant in a single channel of the SIMD register. In the best case, it reuses a value that was loaded into a register to satisfy another instruction. shader-db results: Tiger Lake, Ice Lake, and Skylake had similar results. (Ice Lake shown) total instructions in shared programs: 19951549 -> 19948709 (-0.01%) instructions in affected programs: 482795 -> 479955 (-0.59%) helped: 1184 / HURT: 3 total cycles in shared programs: 858584724 -> 858205341 (-0.04%) cycles in affected programs: 356168375 -> 355788992 (-0.11%) helped: 1448 / HURT: 1195 total spills in shared programs: 6569 -> 6255 (-4.78%) spills in affected programs: 912 -> 598 (-34.43%) helped: 58 / HURT: 0 total fills in shared programs: 8218 -> 7813 (-4.93%) fills in affected programs: 1570 -> 1165 (-25.80%) helped: 58 / HURT: 0 LOST: 6 GAINED: 16 Broadwell total instructions in shared programs: 17819660 -> 17819389 (<.01%) instructions in affected programs: 1078129 -> 1077858 (-0.03%) helped: 1067 / HURT: 304 total cycles in shared programs: 904722624 -> 905035016 (0.03%) cycles in affected programs: 362583117 -> 362895509 (0.09%) helped: 1381 / HURT: 1123 total spills in shared programs: 17884 -> 17922 (0.21%) spills in affected programs: 5088 -> 5126 (0.75%) helped: 55 / HURT: 152 total fills in shared programs: 25533 -> 26290 (2.96%) fills in affected programs: 12992 -> 13749 (5.83%) 
helped: 61 / HURT: 295 LOST: 7 GAINED: 24 Haswell total instructions in shared programs: 16678080 -> 16673976 (-0.02%) instructions in affected programs: 1162893 -> 1158789 (-0.35%) helped: 1584 / HURT: 7 total cycles in shared programs: 880180082 -> 879932525 (-0.03%) cycles in affected programs: 364067522 -> 363819965 (-0.07%) helped: 1226 / HURT: 976 total spills in shared programs: 14937 -> 14428 (-3.41%) spills in affected programs: 7866 -> 7357 (-6.47%) helped: 351 / HURT: 5 total fills in shared programs: 17572 -> 16975 (-3.40%) fills in affected programs: 11028 -> 10431 (-5.41%) helped: 350 / HURT: 3 LOST: 8 GAINED: 16 Ivy Bridge total instructions in shared programs: 15704044 -> 15703158 (<.01%) instructions in affected programs: 304513 -> 303627 (-0.29%) helped: 707 / HURT: 0 total cycles in shared programs: 433560149 -> 433471118 (-0.02%) cycles in affected programs: 19299650 -> 19210619 (-0.46%) helped: 687 / HURT: 395 LOST: 2 GAINED: 9 Sandy Bridge total instructions in shared programs: 13913386 -> 13912884 (<.01%) instructions in affected programs: 195687 -> 195185 (-0.26%) helped: 455 / HURT: 0 total cycles in shared programs: 741156272 -> 741136266 (<.01%) cycles in affected programs: 10934349 -> 10914343 (-0.18%) helped: 578 / HURT: 289 LOST: 9 GAINED: 4 Iron Lake and GM45 had similar results. 
(Iron Lake shown) total instructions in shared programs: 8364056 -> 8364042 (<.01%) instructions in affected programs: 5178 -> 5164 (-0.27%) helped: 10 / HURT: 0 total cycles in shared programs: 248759794 -> 248757940 (<.01%) cycles in affected programs: 4305246 -> 4303392 (-0.04%) helped: 183 / HURT: 24 fossil-db results: Tiger Lake Instructions in all programs: 156943594 -> 156802601 (-0.1%) Instructions helped: 20595 Instructions hurt: 23248 Cycles in all programs: 7512086950 -> 7528386387 (+0.2%) Cycles helped: 29531 Cycles hurt: 27837 Spills in all programs: 13500 -> 5643 (-58.2%) Spills helped: 394 Spills hurt: 22 Fills in all programs: 18943 -> 6306 (-66.7%) Fills helped: 394 Fills hurt: 11 Gained: 93 Lost: 76 Ice Lake Instructions in all programs: 141395899 -> 141249621 (-0.1%) Instructions helped: 30067 Instructions hurt: 3 Cycles in all programs: 9097127057 -> 9089668235 (-0.1%) Cycles helped: 32268 Cycles hurt: 24315 Spills in all programs: 13695 -> 7564 (-44.8%) Spills helped: 403 Fills in all programs: 18400 -> 8494 (-53.8%) Fills helped: 403 Gained: 114 Lost: 137 Skylake Instructions in all programs: 131948328 -> 131826063 (-0.1%) Instructions helped: 29968 Instructions hurt: 3 Cycles in all programs: 8794778440 -> 8793934844 (-0.0%) Cycles helped: 32705 Cycles hurt: 23575 Spills in all programs: 10526 -> 7039 (-33.1%) Spills helped: 403 Fills in all programs: 11025 -> 7728 (-29.9%) Fills helped: 403 Gained: 102 Lost: 250 Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7698>	2023-08-29 19:01:36 +00:00
Ian Romanick	44d62a5224	intel/fs: Completely re-write the combine constants pass This is a squash of what in the original MR was "util: Add generic pass that tries to combine constants" and "intel/fs: Switch to using util_combine_constants". The new algorithm uses a multi-pass greedy algorithm that attempts to collect constants for loading in order of increasing degrees of freedom. The first pass collects constants that must be emitted as-is (e.g., without source modifiers). The second pass emits all constants that must be emitted (because they are used in a source field that cannot be a literal constant) but that can have a source modifier. The final pass possibly emits constants that may not have to be emitted. This is used for instructions where one of the fields is allowed to be a constant. This is not used in the current commit, but future commits that enable SEL will use this. The SEL instruction can have a single constant, but when both sources are constant, one of the sources has to be loaded into a register. By loading constants in this order, required "choices" made in earlier passes may be re-used in later passes. This provides a more optimal result. At this point in the series, most platforms have the same results with the new implementation. Gen7 platforms see a significant number of "small" changes. Due to the coissue optimization on Gen7, each shader is likely to have most constants affected by constant combining. If a shader has only a single basic block, constants are packed into registers in the order produced by the constant combining process. Since each constant has a different live range in the shader, even slightly different packing orders can have dramatic effects on the live range of a register. Even in cases where this does not affect register pressure in a meaningful way, it can cause the scheduler to make very different choices about the ordering of instructions. From my analysis (using the `if (debug) { ... 
}` block at the end of fs_visitor::opt_combine_constants), the old implementation and the new implementation pick the same set of constants, but the order produced may be slightly different. For the smaller number of values in non-Gfx7 shaders, the orders are similar enough to not matter. No shader-db or fossil-db changes on any non-Gfx7 platforms. Haswell and Ivy Bridge had similar results. (Haswell shown) total cycles in shared programs: 879930036 -> 880001666 (<.01%) cycles in affected programs: 22485040 -> 22556670 (0.32%) helped: 1879 HURT: 2309 helped stats (abs) min: 1 max: 6296 x̄: 258.54 x̃: 34 helped stats (rel) min: <.01% max: 54.63% x̄: 3.88% x̃: 0.87% HURT stats (abs) min: 1 max: 9739 x̄: 241.41 x̃: 40 HURT stats (rel) min: <.01% max: 160.50% x̄: 6.01% x̃: 0.99% 95% mean confidence interval for cycles value: -1.04 35.25 95% mean confidence interval for cycles %-change: 1.23% 1.92% Inconclusive result (value mean confidence interval includes 0). LOST: 82 GAINED: 39 Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7698>	2023-08-29 19:01:36 +00:00
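The pass ordering described in the 44d62a5224 message (constants that must be loaded as-is first, then constants that must be loaded but tolerate a source modifier, then optional ones) can be illustrated with a small sketch. This is not the code from fs_visitor::opt_combine_constants: the names `enum freedom`, `struct candidate`, and `collect_constants` are hypothetical, values are plain ints, negation is the only source modifier modeled, and optional constants are only ever reused, never loaded on their own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical categories, ordered by increasing degrees of freedom. */
enum freedom {
   MUST_EMIT_EXACT,      /* must be loaded as-is, no source modifier */
   MUST_EMIT_MODIFIABLE, /* must be loaded, but a negated copy is acceptable */
   MAY_EMIT,             /* may stay an immediate; only reuses loaded values */
};

struct candidate {
   int value; /* assumed never to be INT_MIN, so -value is well defined */
   enum freedom kind;
};

/* True if `v` is already loaded; with allow_negate, -v also counts. */
static bool
already_loaded(const int *loaded, size_t count, int v, bool allow_negate)
{
   for (size_t i = 0; i < count; i++) {
      if (loaded[i] == v || (allow_negate && loaded[i] == -v))
         return true;
   }
   return false;
}

/* Fills `loaded` (capacity `cap`) and sets in_register[i] when candidate i
 * is served from a loaded value. Returns the number of loaded values.
 */
size_t
collect_constants(const struct candidate *c, size_t n,
                  int *loaded, size_t cap, bool *in_register)
{
   assert(loaded != NULL && in_register != NULL);
   size_t count = 0;

   /* One pass per category, least freedom first. */
   for (int pass = MUST_EMIT_EXACT; pass <= MAY_EMIT; pass++) {
      for (size_t i = 0; i < n; i++) {
         if ((int)c[i].kind != pass)
            continue;

         bool allow_negate = pass != MUST_EMIT_EXACT;
         bool reuse = already_loaded(loaded, count, c[i].value, allow_negate);

         /* Required constants get a new slot when nothing can be reused. */
         if (!reuse && pass != MAY_EMIT && count < cap) {
            loaded[count++] = c[i].value;
            reuse = true;
         }
         in_register[i] = reuse;
      }
   }
   return count;
}
```

With candidates `-5` (modifiable) and `5` (exact) given in that order, the sketch loads only `5`, because the exact constant is placed first and the modifiable one then reuses it through negation.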
Caio Oliveira	58c7ad6ace	hasvk/tests: Propagate failures to gtest Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24355>	2023-08-25 12:08:29 -07:00
Caio Oliveira	27a66f70a5	hasvk/tests: Link a single hasvk_tests binary using gtest Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24355>	2023-08-25 12:08:29 -07:00
Caio Oliveira	66d3b4a8b2	hasvk/tests: Refactor state_pool_test_helper to not use macros for parametrization Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24355>	2023-08-25 12:08:29 -07:00
Caio Oliveira	54b0745b5e	anv/tests: Propagate failures to gtest Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24355>	2023-08-25 12:08:26 -07:00
Caio Oliveira	c374033f5b	anv/tests: Link a single anv_tests binary using gtest Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24355>	2023-08-25 12:08:26 -07:00
Caio Oliveira	695e356d4a	anv/tests: Refactor state_pool_test_helper to not use macros for parametrization Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24355>	2023-08-25 12:08:26 -07:00
Benjamin Cheng	f64f08a9e0	anv/video: send h264 scaling list in raster order ITU spec defines the H264 ScalingList{4x4,8x8} in zig-zag order, but Intel HW wants raster order. Reviewed-by: Lynne <dev@lynne.ee> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24572>	2023-08-25 03:08:13 +00:00
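The zig-zag versus raster distinction in f64f08a9e0 can be shown for the 4x4 case. The sketch below uses the standard H.264 frame zig-zag scan for 4x4 blocks; the function name `scaling_list_4x4_to_raster` is hypothetical and this is not the driver's code, which may rely on shared helpers instead. The 8x8 list follows the same idea with a 64-entry table and is left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* zigzag_4x4[i] is the raster position (row * 4 + column) of the i-th
 * coefficient in H.264 frame zig-zag scan order.
 */
static const uint8_t zigzag_4x4[16] = {
   0,  1,  4,  8,
   5,  2,  3,  6,
   9, 12, 13, 10,
   7, 11, 14, 15,
};

/* Reorders one 4x4 scaling list from zig-zag order into raster order. */
void
scaling_list_4x4_to_raster(const uint8_t zigzag[16], uint8_t raster[16])
{
   assert(zigzag != NULL && raster != NULL);
   for (size_t i = 0; i < 16; i++)
      raster[zigzag_4x4[i]] = zigzag[i];
}
```

Because the table is a permutation of 0..15, every raster entry is written exactly once.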
Benjamin Cheng	e921b889e3	anv/video: use vk_video_derive_h264_scaling_list Reviewed-by: Hyunjun Ko <zzoon@igalia.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Lynne <dev@lynne.ee> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24572>	2023-08-25 03:08:13 +00:00
David Heidelberg	e51056f9f7	ci/iris: add GL46.arrays_of_arrays_gl.SizedDeclarationsPrimitive timeout Signed-off-by: David Heidelberg <david.heidelberg@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24870>	2023-08-24 17:06:42 +00:00
Alyssa Rosenzweig	cda1961835	treewide: Also handle struct nir_builder form Via Coccinelle patch: @def@ typedef bool; typedef nir_builder; typedef nir_instr; typedef nir_def; identifier fn, instr, intr, x, builder, data; @@ static fn(struct nir_builder* builder, -nir_instr instr, +nir_intrinsic_instr intr, ...) { ( - if (instr->type != nir_instr_type_intrinsic) - return false; - nir_intrinsic_instr intr = nir_instr_as_intrinsic(instr); \| - nir_intrinsic_instr intr = nir_instr_as_intrinsic(instr); - if (instr->type != nir_instr_type_intrinsic) - return false; ) <... ( -instr->x +intr->instr.x \| -instr +&intr->instr ) ...> } @pass depends on def@ identifier def.fn; expression shader, progress; @@ ( -nir_shader_instructions_pass(shader, fn, +nir_shader_intrinsics_pass(shader, fn, ...) \| -NIR_PASS_V(shader, nir_shader_instructions_pass, fn, +NIR_PASS_V(shader, nir_shader_intrinsics_pass, fn, ...) \| -NIR_PASS(progress, shader, nir_shader_instructions_pass, fn, +NIR_PASS(progress, shader, nir_shader_intrinsics_pass, fn, ...) ) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24852>	2023-08-24 15:48:02 +00:00
Alyssa Rosenzweig	465b138f01	treewide: Use nir_shader_intrinsic_pass sometimes This converts a lot of trivial passes. Nice boilerplate deletion. Via Coccinelle patch (with a small manual fix-up for panfrost where coccinelle got confused by genxml + ninja clang-format squashed in, and for Zink because my semantic patch was slightly buggy). @def@ typedef bool; typedef nir_builder; typedef nir_instr; typedef nir_def; identifier fn, instr, intr, x, builder, data; @@ static fn(nir_builder* builder, -nir_instr instr, +nir_intrinsic_instr intr, ...) { ( - if (instr->type != nir_instr_type_intrinsic) - return false; - nir_intrinsic_instr intr = nir_instr_as_intrinsic(instr); \| - nir_intrinsic_instr intr = nir_instr_as_intrinsic(instr); - if (instr->type != nir_instr_type_intrinsic) - return false; ) <... ( -instr->x +intr->instr.x \| -instr +&intr->instr ) ...> } @pass depends on def@ identifier def.fn; expression shader, progress; @@ ( -nir_shader_instructions_pass(shader, fn, +nir_shader_intrinsics_pass(shader, fn, ...) \| -NIR_PASS_V(shader, nir_shader_instructions_pass, fn, +NIR_PASS_V(shader, nir_shader_intrinsics_pass, fn, ...) \| -NIR_PASS(progress, shader, nir_shader_instructions_pass, fn, +NIR_PASS(progress, shader, nir_shader_intrinsics_pass, fn, ...) ) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24852>	2023-08-24 15:48:02 +00:00
Mauro Rossi	ef6725a5f4	hasvk/android: remove numFds check Change required for compatibility with minigbm gralloc4 due to gralloc handle having DRV_MAX_FDS = (DRV_MAX_PLANES + 1) https://android.googlesource.com/platform/external/minigbm/+/refs/tags/android-13.0.0_r18/cros_gralloc/cros_gralloc_handle.h#14 Cc: "22.3" mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7807 Reviewed-by: Roman Stratiienko <r.stratiienko@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20231>	2023-08-24 11:07:12 +00:00
Mauro Rossi	143d417fcc	anv/android: remove numFds check Change required for compatibility with minigbm gralloc4 due to gralloc handle having DRV_MAX_FDS = (DRV_MAX_PLANES + 1) https://android.googlesource.com/platform/external/minigbm/+/refs/tags/android-13.0.0_r18/cros_gralloc/cros_gralloc_handle.h#14 Cc: "22.2" "22.3" mesa-stable Reviewed-by: Roman Stratiienko <r.stratiienko@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20231>	2023-08-24 11:07:12 +00:00
Chris Spencer	6a4e9b55e4	anv: Don't reject Android image format if external props not supplied anv_GetPhysicalDeviceImageFormatProperties2 returns 'not supported' if an Android hardware buffer external memory handle type is specified, but no external image format properties output struct is supplied. This struct is optional, so we should populate it if present, but return successfully either way. This fixes an error when using ANV with hwui, which otherwise prevents the system from booting.[1] [1] https://cs.android.com/android/platform/superproject/main/+/main:frameworks/base/libs/hwui/renderthread/VulkanSurface.cpp;l=271;drc=ad3fb95aa2fe0be59d3e991ddc883592ab5542bc Signed-off-by: Chris Spencer <spencercw@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24844>	2023-08-24 10:26:09 +00:00
Yonggang Luo	0b84e38684	intel/brw: use 4 instead of MAX_VERTEX_STREAMS to avoid #include "mesa/main/config.h" Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24824>	2023-08-24 02:54:08 +00:00
Kenneth Graunke	08fc4603dd	intel/fs: Dump IR for pre-RA scheduler modes in DEBUG_OPTIMIZER This lets us more easily compare and contrast the various scheduling options that the compiler considered. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24707>	2023-08-23 21:34:38 +00:00
Kenneth Graunke	07f2ad32e4	intel/fs: Pick the lowest register pressure schedule when spilling We try various pre-RA scheduler modes and see if any of them allow us to register allocate without spilling. If all of them spill, however, we left it on the last mode: LIFO. This is unfortunately sometimes significantly worse than other modes (such as "none"). This patch makes us instead select the pre-RA scheduling mode that gives the lowest register pressure estimate, if none of them manage to avoid spilling. The hope is that this scheduling will spill the least out of all of them. fossil-db stats (on Alchemist) speak for themselves: Totals: Instrs: 197297092 -> 195326552 (-1.00%); split: -1.02%, +0.03% Cycles: 14291286956 -> 14303502596 (+0.09%); split: -0.55%, +0.64% Spill count: 190886 -> 129204 (-32.31%); split: -33.01%, +0.70% Fill count: 361408 -> 225038 (-37.73%); split: -39.17%, +1.43% Scratch Memory Size: 12935168 -> 10868736 (-15.98%); split: -16.08%, +0.10% Totals from 1791 (0.27% of 668386) affected shaders: Instrs: 7628929 -> 5658389 (-25.83%); split: -26.50%, +0.67% Cycles: 719326691 -> 731542331 (+1.70%); split: -10.95%, +12.65% Spill count: 110627 -> 48945 (-55.76%); split: -56.96%, +1.20% Fill count: 221560 -> 85190 (-61.55%); split: -63.89%, +2.34% Scratch Memory Size: 4471808 -> 2405376 (-46.21%); split: -46.51%, +0.30% Improves performance when using XeSS in Cyberpunk 2077 by 90% on A770. Improves performance of Borderlands 3 by 1.54% on A770. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24707>	2023-08-23 21:34:38 +00:00
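The selection rule from 07f2ad32e4 (use the first mode that register allocates without spilling, otherwise fall back to the mode with the lowest register pressure estimate) can be sketched as follows. `struct sched_attempt` and `pick_schedule` are hypothetical names, and the sketch assumes the results of every attempt are already available in preference order, whereas the compiler itself tries the modes one at a time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Outcome of trying one pre-RA scheduling mode (hypothetical structure). */
struct sched_attempt {
   bool allocated_without_spilling;
   unsigned pressure_estimate;
};

/* attempts[] is in order of preference. Returns the index of the mode to
 * use: the first one that allocates without spilling, otherwise the one
 * with the lowest pressure estimate (ties go to the more preferred mode).
 */
size_t
pick_schedule(const struct sched_attempt *attempts, size_t n)
{
   assert(n > 0);
   size_t best = 0;

   for (size_t i = 0; i < n; i++) {
      if (attempts[i].allocated_without_spilling)
         return i;
      if (attempts[i].pressure_estimate < attempts[best].pressure_estimate)
         best = i;
   }
   return best;
}
```

Before this change the fallback was effectively "whichever mode was tried last", which corresponds to always returning `n - 1` when every attempt spills.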
Kenneth Graunke	158ac265df	intel/fs: Make helpers for saving/restoring instruction order This moves a bit of code out of a large function, but also lets us reuse it a few extra places in the next commit. I opted to stop using ralloc here since this is short-lived data that doesn't need to stick around for the rest of the compile, and it's easy enough to free. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24707>	2023-08-23 21:34:38 +00:00
Kenneth Graunke	2dd56921c9	intel/fs: Index scheduler mode string table by mode enum pre_modes[] is an array with the modes ordered in our desired preference. scheduler_mode_name[] was also in that order, and the two had to be kept in sync. This is a little silly; we should just have a mode enum -> string table and look it up via the enum. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24707>	2023-08-23 21:34:38 +00:00
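The idea in 2dd56921c9, a name table indexed by the mode enum so that it no longer has to mirror the order of the preference array, looks roughly like this in C. The enum values, strings, and the chosen preference order are placeholders made up for the example and are not the identifiers used in the Intel compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder scheduling modes (not the real enum). */
enum sched_mode {
   MODE_TOPDOWN,
   MODE_NON_LIFO,
   MODE_LIFO,
   MODE_NONE,
   MODE_COUNT,
};

/* Indexed by enum value via designated initializers, so entry order in
 * this table is irrelevant.
 */
static const char *const mode_name[MODE_COUNT] = {
   [MODE_TOPDOWN]  = "top-down",
   [MODE_NON_LIFO] = "non-lifo",
   [MODE_LIFO]     = "lifo",
   [MODE_NONE]     = "none",
};

/* Preference order; it can be reshuffled without touching mode_name[]. */
static const enum sched_mode preferred_order[] = {
   MODE_TOPDOWN, MODE_NON_LIFO, MODE_NONE, MODE_LIFO,
};

/* Name of the mode at position `rank` in the preference order. */
const char *
preferred_mode_name(size_t rank)
{
   assert(rank < sizeof(preferred_order) / sizeof(preferred_order[0]));
   return mode_name[preferred_order[rank]];
}
```

Keeping two parallel arrays in the same order works until someone reorders one of them; the enum lookup removes that coupling.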
Kenneth Graunke	7eba19245d	intel/compiler: Move SCHEDULE_NONE handling into schedule_instructions() I'm going to introduce another call site for this function, and just handling SCHEDULE_NONE in the scheduler itself makes more sense than duplicating the logic. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24707>	2023-08-23 21:34:38 +00:00
Kenneth Graunke	743fd60bea	intel/fs: Account for payload GRFs when calculating register pressure The register pressure analysis I wrote in 2013 only considered VGRFs, and not other GRFs, such as payload registers and push constants. We need to consider those too, because payload registers definitely occupy space and add to pressure. In 2015, Connor already made the scheduler account for this, so the only real use for this is in shader statistic dumps and optimizer printouts. But we should make it more accurate. (We will use it in more places shortly, a few commits from now.) Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24707>	2023-08-23 21:34:38 +00:00
Chris Spencer	bda4eb18dd	anv: Advertise Vulkan 1.3 on Android 13 Older versions of Android rejected newer versions of Vulkan,[1] but Android 13 devices are 'strongly recommended' to support Vulkan 1.3.[2] [1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4781 [2] https://source.android.com/docs/compatibility/13/android-13-cdd#7142_vulkan Signed-off-by: Chris Spencer <spencercw@gmail.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24816>	2023-08-23 14:31:26 +00:00
Sviatoslav Peleshko	9865e5dff4	anv: Do fast clear color initialization more delicately Fixes: `b4198e79` ("anv/cmd_buffer: Initalize the clear color struct for CNL+") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9464 Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24768>	2023-08-23 12:55:08 +00:00
Sviatoslav Peleshko	caa5c23e48	intel/isl: Don't over-allocate CLEAR_COLOR size to use whole cache line At the time this was added to fix some test failures. But it seems that the failures were happening due to missing cache flushes, so this extra space is no longer necessary. Fixes: 37b4eacc ("intel/isl: Resize clear color buffer to full cacheline") Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24768>	2023-08-23 12:55:08 +00:00
Chris Spencer	280281f8f7	anv/android: Add support for AHARDWAREBUFFER_FORMAT_YV12 The default MediaCodec software video decoder returns frames in this format. Signed-off-by: Chris Spencer <spencercw@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24388>	2023-08-23 09:56:03 +00:00
Chris Spencer	35fddccf3f	anv/android: Fix importing hardware buffers with planar formats Currently, we try to fetch the color aspect of the format and convert that to an ISL format, which is then used to convert the pixel stride to bytes. This does not work with planar formats because they don't have a color aspect, and the planes can be of different sizes anyway, so may not have the same byte stride. Change to calculate the stride individually for each plane. Signed-off-by: Chris Spencer <spencercw@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24388>	2023-08-23 09:56:03 +00:00
Sagar Ghuge	839b03cc06	blorp: Drop unnecessary assertions in blorp_can_hiz_clear_depth We already check for the alignment and the multislice surface, so we don't need assertions around those two. fixes: `37fcbb375c` ("blorp: Disable unaligned partial HIZ fast clears for HIZ_CCS too") closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9684 Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Tested-by: Mark Janes <markjanes@swizzler.org> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24837>	2023-08-23 00:35:07 +00:00
Emma Anholt	5bd0750921	intel/fs: Simplify compute_start_end(). Now that we have moved the screening up, we can simplify the code. No change in shader-db steam performance, n=10. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24702>	2023-08-22 23:34:30 +00:00
Emma Anholt	2b01246f49	intel/fs: Move the defin[]/defout[] screening up to livein[]/liveout[] setup. This keeps us from having to run the loop to propagate up quite so much. steam shader-db time -1.86356% +/- 0.941498% (n=10). There's a small scheduling effect, since previously the scheduler wasn't considering defin/defout: cycles helped: shaders/closed/steam/amnesia-the-dark-descent/high/241.shader_test FS SIMD16: 11428 -> 11422 (-0.05%) (scheduled: scheduled) cycles helped: shaders/humus-volumetricfogging2/1.shader_test FS SIMD32: 13832 -> 13800 (-0.23%) (scheduled: scheduled) cycles helped: shaders/tesseract/479.shader_test FS SIMD32: 9330 -> 8644 (-7.35%) (scheduled: scheduled) cycles HURT: shaders/robclark-shaders/android/angle/aztec_ruins/36.shader_test FS SIMD32: 7870 -> 7940 (0.89%) (scheduled: scheduled) cycles HURT: shaders/robclark-shaders/gfxbench5/gl_5_high_off/57.shader_test FS SIMD32: 7870 -> 7940 (0.89%) (scheduled: scheduled) cycles HURT: shaders/robclark-shaders/gfxbench5/gl_5_normal_off/54.shader_test FS SIMD32: 7870 -> 7940 (0.89%) (scheduled: scheduled) cycles HURT: shaders/robclark-shaders/android/angle/aztec_ruins/30.shader_test FS SIMD32: 8726 -> 8808 (0.94%) (scheduled: scheduled) cycles HURT: shaders/robclark-shaders/gfxbench5/gl_5_high_off/51.shader_test FS SIMD32: 8726 -> 8808 (0.94%) (scheduled: scheduled) cycles HURT: shaders/robclark-shaders/gfxbench5/gl_5_normal_off/48.shader_test FS SIMD32: 8726 -> 8808 (0.94%) (scheduled: scheduled) cycles HURT: shaders/robclark-shaders/gfxbench5/gl_4_off/129.shader_test TCS SIMD8: 3911 -> 3979 (1.74%) (scheduled: scheduled) cycles HURT: shaders/robclark-shaders/gfxbench5/gl_4_off/109.shader_test TCS SIMD8: 3911 -> 3979 (1.74%) (scheduled: scheduled) total cycles in shared programs: 313096438 -> 313096306 (<.01%) cycles in affected programs: 92200 -> 92068 (-0.14%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24702>	2023-08-22 23:34:30 +00:00
Emma Anholt	ed4e1becea	intel/fs: Move defin/defout setup to the start of the loop. Refactor for the next commit. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24702>	2023-08-22 23:34:30 +00:00
Eric Engestrom	566c919df8	ci/deqp: backport fix for dEQP-EGL.functional.wide_color._888_colorspace_ Signed-off-by: Eric Engestrom <eric@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24808>	2023-08-22 18:12:08 +00:00
Emma Anholt	37fcbb375c	blorp: Disable unaligned partial HIZ fast clears for HIZ_CCS too. Fixes MSAA scissored fast clears under zink and ANGLE. Fixes: `e488773b29` ("anv: Fast clear depth/stencil surface in vkCmdClearAttachments") Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24225>	2023-08-22 16:34:52 +00:00
Tapani Pälli	c9abcddad4	anv: implement a dummy depth flush for Wa_14016712196 Emit depth flush after state that sends implicit depth flush. These states are: 3DSTATE_HIER_DEPTH_BUFFER 3DSTATE_STENCIL_BUFFER 3DSTATE_DEPTH_BUFFER 3DSTATE_CPSIZE_CONTROL_BUFFER Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24689>	2023-08-22 12:49:37 +00:00
Georg Lehmann	9cf6984200	nir: unify lower_find_msb with has_{find_msb_rev,uclz} Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24662>	2023-08-22 12:08:37 +00:00
Georg Lehmann	2ac7e6614a	nir: unify lower_bitfield_extract with has_bfe Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24662>	2023-08-22 12:08:37 +00:00
Georg Lehmann	34c3f81614	nir: unify lower_bitfield_insert with has_{bfm,bfi,bitfield_select} Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24662>	2023-08-22 12:08:37 +00:00
José Roberto de Souza	a425ae17ac	anv: Update Wa_16014390852 for MTL On MTL Wa_16014390852 is fixed on B0 stepping so we can't use a macro check anymore for this workaround. cc: mesa-stable Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24812>	2023-08-22 06:33:56 +00:00
David Heidelberg	6079c3ca49	ci: disable Material Testers.x86_64_2020.04.08_13.38_frame799.rdc trace This change will be reverted as soon as the Collabora proxy gets fixed. Signed-off-by: David Heidelberg <david.heidelberg@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24819>	2023-08-21 22:31:21 +00:00
Rohan Garg	8849e1e3a6	anv: emitting 3DSTATE_PRIMITIVE_REPLICATION is required on Gen12+ This change helps fix the following tests on future platforms: - func.multiview - dEQP-VK.fragment_shading_rate.renderpass2.monolithic.multiviewsrlayered.dynamic.attachment.noshaderrate.keep.replace.1x1.samples1.vs - anything else that uses multiview Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24746>	2023-08-18 11:36:45 +00:00
Faith Ekstrand	b5d6b7c402	nir: Drop most uses of nir_instr_rewrite_src() Generated by the following semantic patch: @@ expression I, S, D; @@ -nir_instr_rewrite_src(I, S, nir_src_for_ssa(D)); +nir_src_rewrite(S, D); Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24729>	2023-08-18 01:00:15 +00:00
Faith Ekstrand	de063a1481	nir: Drop most uses of nir_instr_rewrite_src_ssa() Generated with the following semantic patch: @@ expression I, S, D; @@ -nir_instr_rewrite_src_ssa(I, S, D); +nir_src_rewrite(S, D); Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24729>	2023-08-18 01:00:15 +00:00

10066 commits