fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 09:18:10 +02:00

Author	SHA1	Message	Date
Simon Ser	35e25ea1d0	ac/surface: allow non-DCC modifiers for YUV on GFX9+ Accept non-linear tiling for multi-planar formats on GFX9+, as long as DCC is disabled. DCC support is possible in theory, but untested for now. GFX8 is still restricted to linear tiling because it's not yet clear how modifiers should be handled on these chips for multi-planar formats. Each plane may need a different modifier. Signed-off-by: Simon Ser <contact@emersion.fr> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Leo Liu <leo.liu@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10134>	2021-04-15 09:43:17 +00:00
Simon Ser	19378dfe3c	ac/surface: use blocksizebits instead of blocksize util_format_get_blocksize asserts that the blocksize isn't zero. However the blocksize will be zero if the format's channel encoding is unspecified. The channel encoding is only meaningful for the plain u_format layout, so util_format_get_blocksize can't be used for formats with another layout. For example, YUV formats don't have the channel encoding specified. Use util_format_get_blocksizebits, which just returns zero without an assertion for formats which don't have a channel encoding. Signed-off-by: Simon Ser <contact@emersion.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10134>	2021-04-15 09:43:17 +00:00
Rhys Perry	5b8a4516e6	aco/ra: remove live-in temporary from live_out_per_block when moving it Otherwise, handle_loop_phis() might pass it to handle_live_in() and then we could have two phis for this variable. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `7c64623e94` ("aco/ra: refactor SSA repairing during register allocation") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10236>	2021-04-14 19:04:08 +00:00
Rhys Perry	11fde1247c	aco/ra: use original names when renaming loop carried phi operands Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `7c64623e94` ("aco/ra: refactor SSA repairing during register allocation") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10236>	2021-04-14 19:04:08 +00:00
Samuel Pitoiset	97e7b21c42	ac: add missing BUF_DATA_FORMAT_10_11_11 vertex format on GFX10+ This format is supported by the driver. Fixes vertex explosion in Dirt 5. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4635 Cc: mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10226>	2021-04-14 18:07:41 +00:00
Bas Nieuwenhuizen	8ddbac0377	radv/winsys: Remove use_local_bos Now that perftest is stored in the winsys. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10198>	2021-04-14 15:16:17 +00:00
Bas Nieuwenhuizen	284bc57a49	radv: Use VRAM cmdbuffers in more situations. In most games I tested we use 32 MiB of cmdbuffers+cmd upload buffers at most. Especially since we have mutable descriptors it seems somewhat unlikely anything else will eat it up so be a bit more aggressive allocating them in VRAM. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10198>	2021-04-14 15:16:17 +00:00
Bas Nieuwenhuizen	057ec395a4	radv: Refactor cs_domain to be a winsys function. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10198>	2021-04-14 15:16:17 +00:00
Timur Kristóf	f3e004cb56	aco: Add a simple heuristic to decide early or late primitive export. Late export is theoretically better if used with LATE_ALLOC, but in practice, the early export has an advantage of lower register usage, therefore more concurrent waves. The idea of this commit is that "small" shaders benefit from early primitive export more, due to being able to launch many more waves. Let's consider a NIR shader "small" when it has only 1 block. This yields both better performance, and better stats, than always using late export. Fossil DB on Sienna: Totals from 12807 (8.76% of 146265) affected shaders: VGPRs: 609128 -> 620216 (+1.82%); split: -0.01%, +1.83% SpillSGPRs: 1458 -> 1538 (+5.49%) CodeSize: 37028204 -> 37019320 (-0.02%); split: -0.17%, +0.14% MaxWaves: 282902 -> 278516 (-1.55%) Instrs: 7163142 -> 7162925 (-0.00%); split: -0.18%, +0.18% VClause: 169285 -> 169547 (+0.15%); split: -1.15%, +1.30% SClause: 267373 -> 267151 (-0.08%); split: -0.24%, +0.16% Copies: 446442 -> 444567 (-0.42%); split: -2.68%, +2.26% Branches: 156245 -> 156195 (-0.03%); split: -0.30%, +0.26% PreSGPRs: 434701 -> 447396 (+2.92%) PreVGPRs: 527783 -> 540527 (+2.41%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	5dbab03a80	aco: Emit fewer branches for NGG VS/TES with late primitive export. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	af7d5f5b86	aco: Set block_kind_export_end in create_vs/fs_exports. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	2b312a4fd7	aco: Extract ngg_nogs_export_prim_id to a separate function. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	231ef14b3d	aco: Use s_setprio 3 at the beginning of every VS and TES. The user-set priority of shaders matters very little, but we hope this might still help speed up VS input loads especially. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	4c86c7aa15	aco: Remove useless s_setprio near gs_alloc_req. We learned that the gs_alloc_req is not actually when the export space allocation happens. So it makes no sense to prioritize it. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	75cd43741a	aco: Align NGG scratch size to 16 so a single ds_read can always read it. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10155>	2021-04-14 14:05:24 +00:00
Timur Kristóf	c1346e5c22	aco: Optimize workgroup exclusive scan to better avoid bank conflicts. Previously, every wave had multiple active lanes read the LDS, and the data was processed by VALU DPP instructions. Now, only the first lane reads the LDS in order to avoid bank conflicts, and the results are processed by SALU. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10155>	2021-04-14 14:05:24 +00:00
Rhys Perry	e3ebc1ca4b	radv: fix conditions for running nir_opt_vectorize No fossil-db changes, probably because all fp16 shaders have at least one 16-bit mov or vec2 somewhere. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10227>	2021-04-14 12:27:06 +00:00
Samuel Pitoiset	e24049da63	radv: advertise attachmentFragmentShadingRate on GFX10.3 Layered VRS attachments are for later. The CTS failures are similar to the existing ones, I will investigate soon. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10187>	2021-04-14 09:31:13 +00:00
Samuel Pitoiset	ee77dde396	radv: configure the VRS combiners when an attachment is used Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10187>	2021-04-14 09:31:13 +00:00
Samuel Pitoiset	ea370f45b4	radv: copy VRS rates to HTILE when beginning a subpass The global VRS image is created on-demand to avoid wasting space. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10187>	2021-04-14 09:31:13 +00:00
Samuel Pitoiset	0b7e346203	radv: add support for copying VRS rates into HTILE Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10187>	2021-04-14 09:31:13 +00:00
Samuel Pitoiset	164b1884c0	radv: bind our internal depth buffer when not provided by the app When a subpass uses a VRS attachment without binding a depth/stencil attachment (yes, this is allowed by the Vulkan spec), we have to bind our internal depth buffer that contains the VRS data. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10187>	2021-04-14 09:31:13 +00:00
Samuel Pitoiset	513a166a7b	radv: handle the VRS attachment subpass Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10187>	2021-04-14 09:31:13 +00:00
Samuel Pitoiset	bb88f73ad3	radv: create an image for VRS if no depth/stencil attachment is bound The Vulkan spec doesn't require the application to always bind a depth/stencil attachment when a VRS attachment is used inside the same subpass. To handle this situation, the driver creates a global 4096x4096 VRS image that will be bound at draw-time if needed. This isn't super ideal but we have to do that unfortunately. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10187>	2021-04-14 09:31:13 +00:00
Samuel Pitoiset	ba7c510e1f	radv: allow HTILE for very small images if VRS attachment is used We need a HTILE buffer to store the VRS rates. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10187>	2021-04-14 09:31:13 +00:00
Samuel Pitoiset	7bd3a9f502	radv: update the HTILE clear word when VRS is used SR1 is the VRS x-rate. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10187>	2021-04-14 09:31:13 +00:00
Samuel Pitoiset	d075711b0e	radv: do not use the whole HTILE buffer for depth when VRS is used The stencil data needs to be included for storing the VRS rates into the HTILE buffer. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10187>	2021-04-14 09:31:13 +00:00
Samuel Pitoiset	73dac68cb8	radv: configure the VRS HTILE encoding size Any depth buffer can potentially use VRS. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10187>	2021-04-14 09:31:13 +00:00
Samuel Pitoiset	62033e0cb3	radv: determine if attachment VRS is enabled When a VRS attachment is used, any depth buffer can potentially be used for VRS. We also have to create a global depth buffer if the app doesn't provide one. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10187>	2021-04-14 09:31:13 +00:00
Samuel Pitoiset	9141716481	radv: do not enable DCC for fragment shading rate attachments That's unnecessary. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10187>	2021-04-14 09:31:13 +00:00
Samuel Pitoiset	5980cd5768	radv: do not allow MSAA with fragment shading rate attachments The Vulkan spec requires the implementation to only support VK_SAMPLE_COUNT_1_BIT with fragment shading rate attachments. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10187>	2021-04-14 09:31:13 +00:00
Samuel Pitoiset	ec6da922df	radv: expose R8_UINT as the only supported format for VRS attachments Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10187>	2021-04-14 09:31:13 +00:00
Samuel Pitoiset	f232c404d3	ac/surface: store the HTILE pitch to the surface This will be used to copy VRS rates to the HTILE buffer. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10187>	2021-04-14 09:31:13 +00:00
Samuel Pitoiset	b9c7c5098e	ac/surface: implement HtileAddrFromCoord in NIR Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10187>	2021-04-14 09:31:13 +00:00
Samuel Pitoiset	9fabbf2150	ac/surface: copy the HTILE equations to the surface Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10187>	2021-04-14 09:31:13 +00:00
Samuel Pitoiset	438e02fc0f	ac/surface: increase gfx9_meta_equation::gfx10_bits by 4 elements For the HTILE equation. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10187>	2021-04-14 09:31:13 +00:00
Samuel Pitoiset	9eee737364	ac/surface: rename gfx9_dcc_equation to gfx9_meta_equation gfx9_meta_equation will be used to store the HTILE equation. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10187>	2021-04-14 09:31:13 +00:00
Samuel Pitoiset	1edda9e878	ac/surface: add a test of HtileAddrFromCoord prototype outside of addrlib Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10187>	2021-04-14 09:31:13 +00:00
Samuel Pitoiset	64bd245c84	ac/surface: rename ac_surface_dcc_address_test.c This file will also contain HTILE equation tests. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10187>	2021-04-14 09:31:13 +00:00
Samuel Pitoiset	4d25229c24	amd/addrlib: expose HTILE address equations to drivers on GFX10+ Similar to the DCC address equations. Only GFX10+ because this is for copying VRS rates to the HTILE buffer. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10187>	2021-04-14 09:31:13 +00:00
Marek Olšák	1dff495057	ac/llvm: implement 16-bit packed VS outputs and FS inputs Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9051>	2021-04-13 21:10:43 -04:00
Daniel Schürmann	b6a28aaa8b	aco/cssa: don't create parallelcopies for constants and exec if we are able to spill these directly. Totals from 4913 (3.60% of 136546) affected shaders (Raven): SpillSGPRs: 16021 -> 15451 (-3.56%); split: -3.87%, +0.31% CodeSize: 58102020 -> 57371464 (-1.26%); split: -1.26%, +0.00% Instrs: 11411454 -> 11230105 (-1.59%); split: -1.59%, +0.00% Latency: 555706331 -> 550058635 (-1.02%); split: -1.07%, +0.05% InvThroughput: 273023354 -> 271854469 (-0.43%); split: -0.44%, +0.01% SClause: 385168 -> 385371 (+0.05%); split: -0.01%, +0.06% Copies: 1342084 -> 1175762 (-12.39%); split: -12.40%, +0.01% Branches: 392619 -> 378662 (-3.55%); split: -3.56%, +0.00% Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	18ba93e673	aco/cssa: rewrite lower_to_cssa pass The previous pass was based on misconceptions and rounded up with bug fixes. The new pass is entirely rewritten and basically just one-to-one from the paper: "Revisiting Out-of-SSA Translation for Correctness, Code Quality, and Efficiency" by B. Boissinot et al. It also incorporates the value-equality testing. The regressions are mainly due to creating parallelcopies for exec phis at loop headers (mitigated in the next commit). Totals from 4933 (3.61% of 136546) affected shaders (Raven): SpillSGPRs: 16249 -> 16527 (+1.71%); split: -0.28%, +1.99% SpillVGPRs: 1771 -> 1595 (-9.94%) CodeSize: 57544436 -> 58280304 (+1.28%); split: -0.00%, +1.28% Scratch: 176128 -> 179200 (+1.74%) Instrs: 11265783 -> 11445884 (+1.60%); split: -0.00%, +1.60% Latency: 552596156 -> 555880540 (+0.59%); split: -0.53%, +1.13% InvThroughput: 271431862 -> 273097423 (+0.61%); split: -0.18%, +0.79% VClause: 160240 -> 160241 (+0.00%); split: -0.02%, +0.02% SClause: 386863 -> 386685 (-0.05%); split: -0.07%, +0.02% Copies: 1180801 -> 1345633 (+13.96%); split: -0.02%, +13.98% Branches: 379129 -> 393052 (+3.67%); split: -0.01%, +3.69% Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	9d73a4a412	aco: add new reindex_ssa() pass Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	d75c73e6a6	aco: fix kill flags on phi operands Fossil-db changes are likely due to how the CSSA pass works. Totals from 1782 (1.31% of 136546) affected shaders (Raven): CodeSize: 25333292 -> 25294020 (-0.16%); split: -0.16%, +0.00% Instrs: 4916059 -> 4908218 (-0.16%); split: -0.16%, +0.00% Latency: 282860167 -> 282707176 (-0.05%); split: -0.08%, +0.03% InvThroughput: 136487564 -> 136394958 (-0.07%); split: -0.12%, +0.05% VClause: 74791 -> 74795 (+0.01%) Copies: 542115 -> 534280 (-1.45%); split: -1.48%, +0.04% Branches: 168977 -> 168966 (-0.01%); split: -0.01%, +0.01% Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	13e4fed01f	aco: lower p_spill with constants correctly Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	4a57787006	aco/spill: use correct next_use_distances at loop header To decide which variables to spill, we must use the distances at the beginning of the loop-header, and not the distances at the end of the loop-preheader. The difference is that the former includes phis which are viable to be spilled as opposed to the phi operands which would be reloaded by add_coupling_code(), ending up in potentially too high register pressure before the loop. Totals from 206 (0.15% of 136546) affected shaders (Raven): SpillSGPRs: 5154 -> 5000 (-2.99%) CodeSize: 3654072 -> 3647184 (-0.19%); split: -0.19%, +0.00% Instrs: 701482 -> 700526 (-0.14%); split: -0.14%, +0.00% Latency: 40988780 -> 40872506 (-0.28%); split: -0.29%, +0.00% InvThroughput: 20364560 -> 20306006 (-0.29%) SClause: 20192 -> 20198 (+0.03%) Copies: 77732 -> 77688 (-0.06%); split: -0.08%, +0.03% Branches: 24204 -> 24050 (-0.64%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	b56ea19111	aco/spill: refactor live-in registerDemand calculation This also fixes some hypothetical issue for loops without phis and for loops with higher register pressure at the end of the loop preheader. No fossil-db changes. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	282eacc3e0	aco/spill: refactor some more spill decision taking No fossil-db changes. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	dfb10e4f4b	aco/spill: don't count phis as variable access This increases the chance of evicting phis if these have longer next-use distances. Totals from 6 (0.00% of 146267) affected shaders (Navi10): CodeSize: 476992 -> 464388 (-2.64%) Instrs: 81785 -> 79952 (-2.24%) VClause: 2380 -> 2374 (-0.25%) Copies: 26836 -> 25131 (-6.35%) Branches: 2494 -> 2492 (-0.08%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
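
Commit 75cd43741a in the list above aligns the NGG scratch size to 16 so that a single ds_read can always read it. The rounding involved can be sketched as follows. This is a minimal illustration, not the code from that commit: the helper names `align_up_pow2` and `padded_scratch_size` are hypothetical, and the assumption that sizes are byte counts held in a `uint32_t` is ours.

```c
#include <assert.h>
#include <stdint.h>

/* Round `value` up to the next multiple of `alignment`.
 * `alignment` must be a non-zero power of two, so that
 * (alignment - 1) is a mask of the low bits to clear. */
static inline uint32_t align_up_pow2(uint32_t value, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper: pad a scratch size (assumed to be in bytes)
 * to a multiple of 16, mirroring the intent stated in the commit
 * message rather than its actual implementation. */
static inline uint32_t padded_scratch_size(uint32_t size_bytes)
{
    return align_up_pow2(size_bytes, 16u);
}
```

With this sketch, a size that is already a multiple of 16 is returned unchanged, and any other size is rounded up to the next multiple, so the padded region can always be covered by whole 16-byte reads.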


7161 commits