fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-21 17:38:08 +02:00

Author	SHA1	Message	Date
James Park	a64b36ecaf	ac/surface: Move drm_fourcc.h to common header Useful for including from RADV without copy/paste. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9709>	2021-04-22 08:16:11 +00:00
Rhys Perry	5760386654	radv: only set robust_modes if robustBufferAccess2 is enabled Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10384>	2021-04-21 20:26:58 +00:00
Rhys Perry	8408d0312f	radv: improve vectorization callback for small bit sizes More accurately reflect the hardware's capabilities for byte and short aligned VMEM operations. fossil-db (GFX10.3): Totals from 65 (0.05% of 139391) affected shaders: SGPRs: 4296 -> 4200 (-2.23%) CodeSize: 1000984 -> 1000368 (-0.06%); split: -0.13%, +0.07% Instrs: 177504 -> 177380 (-0.07%); split: -0.17%, +0.10% Cycles: 36820596 -> 36812792 (-0.02%); split: -0.15%, +0.13% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10384>	2021-04-21 20:26:58 +00:00
Rhys Perry	776ba40115	aco: add and use Program::progress This is used when printing the program and to avoid updating register demand during post-RA liveness analysis. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10315>	2021-04-21 11:09:33 +00:00
Rhys Perry	2d36232e62	aco: allow SDWA sels smaller than the operand size p_extract_vector copy-propagation can create byte sels for v2b operands. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10315>	2021-04-21 11:09:33 +00:00
Rhys Perry	655ba1e3a9	aco: don't update register demand during RA validation It isn't intended to be accurate after RA, so num_waves can become zero, breaking the sgpr_limit calculation. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10315>	2021-04-21 11:09:33 +00:00
Samuel Pitoiset	1cf39001cd	radv: allow concurrent MSAA images to be FMASK compressed DCC decompress/FMASK expand are supported on compute queues. Since the driver doesn't perform fast clears with concurrent images, we don't need to perform a FCE on compute (we can't anyways). This fixes a performance regression with Control and VKD3D_CONFIG=multi_queue. One more optimization (as discussed with Bas) is to implement FCE on compute to allow fast clears. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10323>	2021-04-21 08:19:14 +00:00
Rhys Perry	0eaa5dfac0	aco: remove image parameter from get_sampler_desc() We can just check whether tex_instr is NULL instead. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10036>	2021-04-20 17:42:21 +00:00
Rhys Perry	6a7b89c89d	ac/nir: set TRUNC_COORD=0 for nir_texop_tg4 Fixes black squares in Assassin's Creed: Valhalla and rendering of FidelityFX-CACAO demo. shader-db (sienna cichlid): Totals: SGPRS: 2977068 -> 2977220 (0.01 %) VGPRS: 1929624 -> 1929616 (-0.00 %) Spilled SGPRs: 5769 -> 5769 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 61423092 -> 61424672 (0.00 %) bytes Max Waves: 895765 -> 895766 (0.00 %) Totals from affected shaders: SGPRS: 9520 -> 9672 (1.60 %) VGPRS: 7464 -> 7456 (-0.11 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 527432 -> 529012 (0.30 %) bytes Max Waves: 1819 -> 1820 (0.05 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Fixes: `58f25098a0` ("radv: Use TRUNC_COORD on samplers") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10036>	2021-04-20 17:42:21 +00:00
Rhys Perry	3cbe9894f7	aco: set TRUNC_COORD=0 for nir_texop_tg4 Fixes black squares in Assassin's Creed: Valhalla and rendering of FidelityFX-CACAO demo. fossil-db (sienna cichlid): Totals from 3052 (2.09% of 146267) affected shaders: SpillSGPRs: 8437 -> 8646 (+2.48%) CodeSize: 30993832 -> 31116916 (+0.40%); split: -0.00%, +0.40% Instrs: 5869934 -> 5886783 (+0.29%); split: -0.00%, +0.29% Latency: 250330521 -> 250463770 (+0.05%); split: -0.00%, +0.05% InvThroughput: 59797617 -> 59814584 (+0.03%); split: -0.00%, +0.03% VClause: 92114 -> 92132 (+0.02%) SClause: 197373 -> 197338 (-0.02%); split: -0.02%, +0.01% Copies: 479482 -> 482394 (+0.61%); split: -0.01%, +0.61% Branches: 219629 -> 219635 (+0.00%) PreSGPRs: 248970 -> 249366 (+0.16%) fossil-db (polaris10): Totals from 3050 (2.06% of 147787) affected shaders: SGPRs: 282864 -> 282912 (+0.02%); split: -0.01%, +0.02% VGPRs: 242572 -> 242612 (+0.02%) SpillSGPRs: 10387 -> 10675 (+2.77%) CodeSize: 31872460 -> 31996128 (+0.39%) MaxWaves: 10924 -> 10925 (+0.01%) Instrs: 6222217 -> 6239072 (+0.27%) Latency: 317482545 -> 317773685 (+0.09%); split: -0.00%, +0.09% InvThroughput: 156149624 -> 156242072 (+0.06%); split: -0.00%, +0.06% VClause: 92295 -> 92254 (-0.04%); split: -0.05%, +0.01% SClause: 243342 -> 243321 (-0.01%); split: -0.01%, +0.00% Copies: 678902 -> 681700 (+0.41%); split: -0.00%, +0.41% Branches: 219698 -> 219703 (+0.00%) PreSGPRs: 244251 -> 244644 (+0.16%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: `58f25098a0` ("radv: Use TRUNC_COORD on samplers") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3110 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10036>	2021-04-20 17:42:21 +00:00
Samuel Pitoiset	6aaa325f89	radv: remove radv_image_iview::multiplane_planes Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10218>	2021-04-20 12:16:33 +00:00
Samuel Pitoiset	8198aeac8d	radv: remove radv_image_iview::bo This saves one 64-bit pointer. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10218>	2021-04-20 12:16:33 +00:00
Bas Nieuwenhuizen	9da4590df8	amd/common: Use cap to test kernel modifier support. Turns out both kernel v5.10 and v5.11 have the same amdgpu driver version and only one has modifiers ... In addition the version check is kinda annoying for backports. So let's use the cap. Since the cap is technically about ADDFB2 I tested that this works on rendernodes (and reading the code there is no distinction from what kind of node this is called). Fixes: `9a937330ef` ("radeonsi: Only set modifier creation function for GFX9+ & with kernel support.") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10337>	2021-04-20 11:59:43 +00:00
Samuel Pitoiset	1d3542694b	radv: fix emitting depth bias when beginning a command buffer If depth bias is enabled but zero values used, they were never emitted to the command buffer because they are equal to the default values. Previously, they were always emitted when the bound DS attachment changed. This should fix some sort of Z fighting with Dota2 on all GPUs. This also fixes a different issue (ie. some occlusion queries failures) on GFX6 because CLEAR_STATE is not used on that chip. Fixes: `8a47422d97` ("radv: do not scale the depth bias for D16_UNORM depth surfaces") Cc: 21.1 mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10310>	2021-04-20 09:09:38 +00:00
Samuel Pitoiset	e4c0724dc6	radv: fix fast clearing depth-only or stencil-only aspects with HTILE DB isn't coherent with L2 on GFX6-8. This is needed when the clear HTILE mask path is selected. This fixes an issue with avatars in Heroes of The Storm. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3362 Cc: 21.1 mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10277>	2021-04-20 08:20:49 +00:00
Bas Nieuwenhuizen	a144fa608d	radv: Fix memory leak on descriptor pool reset with layout_size=0. Gotta track those sets too to free them. Also changed the search on destroy to check for set instead of offset since offset is not necessarily unique anymore. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4652 CC: mesa-stable Reviewed-by: Joshua Ashton <joshua@froggi.es> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10317>	2021-04-19 19:19:58 +00:00
Samuel Pitoiset	9434675d60	aco: fix opquantize2f16 on GFX6-7 Make sure to preserve signed zeroes. Fixes dEQP-VK.spirv_assembly.instruction.compute.opquantize.flush_to_zero on GFX6 (Pitcairn). Untested on GFX7. Fixes: `54a09545ec` ("aco: optimize a*0.0") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10319>	2021-04-19 16:33:37 +00:00
Marek Olšák	ec1ddb976a	amd/registers: rename IMG_FORMAT to GFX10_FORMAT to disambiguate the meaning Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10261>	2021-04-17 02:37:49 +00:00
Marek Olšák	3e0ce4af4f	amd/registers: clean up gfx103.json because gfx103.json is automatically generated and can't be changed manually. This fixes the file generator without changing the generated header. Missing registers must be in registers-manually-defined.json, and missing fields must be in parse_kernel_headers.py. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10261>	2021-04-17 02:37:49 +00:00
Marek Olšák	a142925b7a	amd/registers: fix the kernel header parser with latest headers Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10261>	2021-04-17 02:37:49 +00:00
Rhys Perry	86d903e88d	radv: fix clearing DCC-compressed e5b9g9r9 images Fixes dEQP-VK.api.image_clearing.core.clear_color_image.2d.optimal.single_layer.e5b9g9r9_ufloat_pack32_33x128 with RADV_DEBUG=forcecompress on GFX10.3. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: 21.1 <mesa-stable> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10176>	2021-04-16 11:56:32 +01:00
Samuel Pitoiset	66e1b42d06	radv: keep DCC compressed for clears on compute with image stores Without image stores, DCC is always decompressed on compute. Cc: 21.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10168>	2021-04-16 09:47:52 +00:00
Marek Olšák	84895dba7f	amd: remove some references to older LLVM versions in comments Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10199>	2021-04-16 09:25:19 +00:00
Marek Olšák	b878444c3a	amd: drop support for LLVM 10 It doesn't support RDNA 2. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10199>	2021-04-16 09:25:19 +00:00
Marek Olšák	2747332723	amd: drop support for LLVM 9 This would be easy to support except that it doesn't support RDNA 2. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10199>	2021-04-16 09:25:19 +00:00
Samuel Pitoiset	936b58378c	amd: drop support for LLVM 8 It doesn't support Navi1x and the removal enables this nice code cleanup. v2: rebase - mareko Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v1) Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10199>	2021-04-16 09:25:19 +00:00
Michel Dänzer	d200f45875	Use explicit break instead of fall-through to break-only case clang generates a warning if there's no explicit break or fall-through annotation. The latter would be kind of silly in this case, and not robust against any future changes turning the fall-through invalid. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>	2021-04-15 16:01:22 +00:00
Michel Dänzer	2928c21eb7	Convert most remaining free-form fall-through comments to FALLTHROUGH One exception is src/amd/addrlib/, for which -Wimplicit-fallthrough is explicitly disabled. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>	2021-04-15 16:01:22 +00:00
Rhys Perry	ec70882238	radv: fix barrier in radv_decompress_dcc_compute shader ACO doesn't create a waitcnt for barriers between texture samples and image stores because texture samples are supposed to use read-only memory. It could also schedule the barrier to above the texture sample. We also have to use a larger memory scope to avoid an ACO optimization. Tested on GFX8 with Sascha Willems' deferred sample, with some DCC decompressions and the compute path forced. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: 21.1 <mesa-stable> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9496>	2021-04-15 12:02:36 +00:00
Hans-Kristian Arntzen	08fdaec473	radv: Allocate buffer list for MUTABLE descriptor types as well. Fixes: `86644b84b9` ("radv: Implement VK_VALVE_mutable_descriptor_type.") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10132>	2021-04-15 11:51:33 +00:00
Hans-Kristian Arntzen	b60bc59180	radv: Take image alignment into account when allocating MUTABLE pool. Allocating a descriptor set is aligned to 32 bytes, so just like the other buffer types, bump the descriptor size to 32 bytes when allocating MUTABLE descriptor types from a pool. Fixes: `86644b84b9` ("radv: Implement VK_VALVE_mutable_descriptor_type.") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10132>	2021-04-15 11:51:33 +00:00
Simon Ser	35e25ea1d0	ac/surface: allow non-DCC modifiers for YUV on GFX9+ Accept non-linear tiling for multi-planar formats on GFX9+, as long as DCC is disabled. DCC support is possible in theory, but untested for now. GFX8 is still restricted to linear tiling because it's not yet clear how modifiers should be handled on these chips for multi-planar formats. Each plane may need a different modifier. Signed-off-by: Simon Ser <contact@emersion.fr> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Leo Liu <leo.liu@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10134>	2021-04-15 09:43:17 +00:00
Simon Ser	19378dfe3c	ac/surface: use blocksizebits instead of blocksize util_format_get_blocksize asserts that the blocksize isn't zero. However the blocksize will be zero if the format's channel encoding is unspecified. The channel encoding is only meaningful for the plain u_format layout, so util_format_get_blocksize can't be used for formats with another layout. For example, YUV formats don't have the channel encoding specified. Use util_format_get_blocksizebits, which just returns zero without an assertion for formats which don't have a channel encoding. Signed-off-by: Simon Ser <contact@emersion.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10134>	2021-04-15 09:43:17 +00:00
Rhys Perry	5b8a4516e6	aco/ra: remove live-in temporary from live_out_per_block when moving it Otherwise, handle_loop_phis() might pass it to handle_live_in() and then we could have two phis for this variable. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `7c64623e94` ("aco/ra: refactor SSA repairing during register allocation") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10236>	2021-04-14 19:04:08 +00:00
Rhys Perry	11fde1247c	aco/ra: use original names when renaming loop carried phi operands Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `7c64623e94` ("aco/ra: refactor SSA repairing during register allocation") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10236>	2021-04-14 19:04:08 +00:00
Samuel Pitoiset	97e7b21c42	ac: add missing BUF_DATA_FORMAT_10_11_11 vertex format on GFX10+ This format is supported by the driver. Fixes vertex explosion in Dirt 5. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4635 Cc: mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10226>	2021-04-14 18:07:41 +00:00
Bas Nieuwenhuizen	8ddbac0377	radv/winsys: Remove use_local_bos Now that perftest is stored in the winsys. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10198>	2021-04-14 15:16:17 +00:00
Bas Nieuwenhuizen	284bc57a49	radv: Use VRAM cmdbuffers in more situations. In most games I tested we use 32 MiB of cmdbuffers+cmd upload buffers at most. Especially since we have mutable descriptors it seems somewhat unlikely anything else will eat it up so be a bit more aggressive allocating them in VRAM. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10198>	2021-04-14 15:16:17 +00:00
Bas Nieuwenhuizen	057ec395a4	radv: Refactor cs_domain to be a winsys function. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10198>	2021-04-14 15:16:17 +00:00
Timur Kristóf	f3e004cb56	aco: Add a simple heuristic to decide early or late primitive export. Late export is theoretically better if used with LATE_ALLOC, but in practice, the early export has an advantage of lower register usage, therefore more concurrent waves. The idea of this commit is that "small" shaders benefit from early primitive export more, due to being able to launch much more waves. Let's consider a NIR shader "small" when it has only 1 block. This yields both better performance, and better stats, than always using late export. Fossil DB on Sienna: Totals from 12807 (8.76% of 146265) affected shaders: VGPRs: 609128 -> 620216 (+1.82%); split: -0.01%, +1.83% SpillSGPRs: 1458 -> 1538 (+5.49%) CodeSize: 37028204 -> 37019320 (-0.02%); split: -0.17%, +0.14% MaxWaves: 282902 -> 278516 (-1.55%) Instrs: 7163142 -> 7162925 (-0.00%); split: -0.18%, +0.18% VClause: 169285 -> 169547 (+0.15%); split: -1.15%, +1.30% SClause: 267373 -> 267151 (-0.08%); split: -0.24%, +0.16% Copies: 446442 -> 444567 (-0.42%); split: -2.68%, +2.26% Branches: 156245 -> 156195 (-0.03%); split: -0.30%, +0.26% PreSGPRs: 434701 -> 447396 (+2.92%) PreVGPRs: 527783 -> 540527 (+2.41%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	5dbab03a80	aco: Emit fewer branches for NGG VS/TES with late primitive export. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	af7d5f5b86	aco: Set block_kind_export_end in create_vs/fs_exports. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	2b312a4fd7	aco: Extract ngg_nogs_export_prim_id to a separate function. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	231ef14b3d	aco: Use s_setprio 3 at the beginning of every VS and TES. The user-set priority of shaders matters very little, but we hope this might still help speed up VS input loads especially. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	4c86c7aa15	aco: Remove useless s_setprio near gs_alloc_req. We learned that the gs_alloc_req is not actually when the export space allocation happens. So it makes no sense to prioritize it. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	75cd43741a	aco: Align NGG scratch size to 16 so a single ds_read can always read it. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10155>	2021-04-14 14:05:24 +00:00
Timur Kristóf	c1346e5c22	aco: Optimize workgroup exclusive scan to better avoid bank conflicts. Previously, every wave had multiple active lanes read the LDS, and the data was processed by VALU DPP instructions. Now, only the first lane reads the LDS in order to avoid bank conflicts, and the results are processed by SALU. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10155>	2021-04-14 14:05:24 +00:00
Rhys Perry	e3ebc1ca4b	radv: fix conditions for running nir_opt_vectorize No fossil-db changes, probably because all fp16 shaders have at least one 16-bit mov or vec2 somewhere. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10227>	2021-04-14 12:27:06 +00:00
Samuel Pitoiset	e24049da63	radv: advertise attachmentFragmentShadingRate on GFX10.3 Layered VRS attachments is for later. The CTS failures are similar to the existing ones, I will investigate soon. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10187>	2021-04-14 09:31:13 +00:00
Samuel Pitoiset	ee77dde396	radv: configure the VRS combiners when an attachment is used Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10187>	2021-04-14 09:31:13 +00:00

7942 commits