fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 20:18:06 +02:00

Author	SHA1	Message	Date
Qiang Yu	035d70f721	ac/nir/ngg,radv: use nir_load_viewport_xy_scale_and_offset Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651>	2022-08-26 05:50:30 +00:00
John Brooks	98ba1e0d81	radv: Fix mipmap views on GFX10+ As explained in the previous commit, GFX9+ has issues with addressing mipmaps in block-compressed images. In the case of copy commands, we fix this by doing an extra copy for the missing blocks. For GFX10, the mipmap layout in memory allows us to do better than that. We can change the base level of the descriptor to one level bigger than the requested level and adjust the extent and address to match. This is done by ComputeNonBlockCompressedView in addrlib. Thus on GFX10 we can skip the fixup copy workaround, and this will also fix cases outside of explicit copy commands. Signed-off-by: John Brooks <john@fastquake.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17970>	2022-08-23 19:01:18 +00:00
John Brooks	35f053ba8c	radv: Fix corrupted mipmap copies on GFX9+ GFX9+ hardware has an issue where mipmap degradations are calculated incorrectly due to using divide-by-two integer math and certain mipmap sizes lose blocks. This issue has been documented before, and we ported a workaround from AMDVLK to increase the extent that is programmed into the descriptor, so that the hardware arrives at the correct result. However, this is insufficient as we cannot safely increase the extent beyond the physical extent of the image in memory. If we can't increase it enough, the image will still be missing blocks. But there is still hope. In cases where RADV is responsible for copying to or from an image (such as vkCmdCopyBufferToImage/vkCmdCopyImageToBuffer), we can perform a second copy of the blocks that the hardware excluded so that the resulting image is complete. This is another workaround from AMDVLK. This fixes corrupted textures in Halo: The Master Chief Collection. v2: Add RADV_CMD_FLAG_INV_L2 | RADV_CMD_FLAG_INV_VCACHE to flush_bits just in case (Samuel Pitoiset) Closes: #3347 Signed-off-by: John Brooks <john@fastquake.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17970>	2022-08-23 19:01:17 +00:00
Friedrich Vock	50238f4958	amd/common: Remove redundant code for determining memory ops per clock Fixes: `82fd379d9e` ("amd/common: move ac_memory_ops_per_clock into ac_gpu_info.h") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18038>	2022-08-16 19:06:21 +00:00
Friedrich Vock	82fd379d9e	amd/common: move ac_memory_ops_per_clock into ac_gpu_info.h Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17973>	2022-08-10 10:58:54 +00:00
Timur Kristóf	dccd6f495a	ac/nir/cull: Fix typo in bounding box culling. Bounding box culling is only viable when the W of all vertices is positive. Always accept triangles that have any negative W. Fixes: `0d527bb1aa` Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7018 Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17929>	2022-08-08 11:16:04 +00:00
Timur Kristóf	e035c289b5	ac/nir/cull: Tweak phi for cull_small_primitive branch. cull_small_primitive will now allow the caller to pass an SSA def that it will use to determine if the primitive was initially rejected. This allows ACO to remove an s_branch instruction from every NGG culling shader. Fossil DB stats on Navi 21: Totals from 60918 (45.16% of 134906) affected shaders: CodeSize: 160086644 -> 159355824 (-0.46%); split: -0.46%, +0.00% Instrs: 30477916 -> 30356092 (-0.40%); split: -0.40%, +0.00% Latency: 139587915 -> 139611487 (+0.02%); split: -0.00%, +0.02% InvThroughput: 21184261 -> 21184346 (+0.00%) Copies: 2762930 -> 2702024 (-2.20%); split: -2.20%, +0.00% Branches: 1236970 -> 1176052 (-4.92%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17919>	2022-08-06 20:43:36 +00:00
Timur Kristóf	e4b0caae61	ac/nir/cull: Make cull functions more consistent. Now they all return whether the primitive was rejected. No Fossil DB changes. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17870>	2022-08-05 22:10:28 +00:00
Timur Kristóf	c721f751f2	ac/nir/ngg: Move LDS store of accepted flag into the inner branch. For primitives which are rejected based on only W and face, this will reduce the number of executed branches. Fossil DB stats on Navi 21: Totals from 60918 (45.16% of 134906) affected shaders: CodeSize: 160330564 -> 160086644 (-0.15%) Instrs: 30477385 -> 30477916 (+0.00%); split: -0.00%, +0.00% Latency: 139802763 -> 139587915 (-0.15%); split: -0.15%, +0.00% InvThroughput: 21198444 -> 21184261 (-0.07%); split: -0.07%, +0.00% SClause: 749811 -> 749810 (-0.00%) Copies: 2701482 -> 2762930 (+2.27%); split: -0.00%, +2.28% Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17870>	2022-08-05 22:10:28 +00:00
Timur Kristóf	0d527bb1aa	ac/nir/cull: Change if condition for bounding box culling. The previous code checked all_w_positive in the if condition. Instead, always execute the bbox culling code and include all_w_positive at the end. We assume checking in the if is not beneficial because it's very unlikely that there is no primitive in a wave whose W are not all positive. This allows moving other things to the condition in the next commit. Fossil DB stats on Navi 21: Totals from 60918 (45.16% of 134906) affected shaders: CodeSize: 160574204 -> 160330564 (-0.15%); split: -0.15%, +0.00% Instrs: 30538297 -> 30477385 (-0.20%); split: -0.20%, +0.00% Latency: 139810902 -> 139802763 (-0.01%); split: -0.01%, +0.00% InvThroughput: 21198449 -> 21198444 (-0.00%); split: -0.00%, +0.00% SClause: 749810 -> 749811 (+0.00%) Copies: 2701474 -> 2701482 (+0.00%); split: -0.00%, +0.00% Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17870>	2022-08-05 22:10:28 +00:00
Timur Kristóf	fb4e68b724	ac/nir/cull: Move the contents of cull_bbox into ac_nir_cull_triangle. No Fossil DB changes. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17870>	2022-08-05 22:10:28 +00:00
Timur Kristóf	e2ca24063a	ac/nir/cull: Move some code from cull_bbox into helper functions. No Fossil DB changes. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17870>	2022-08-05 22:10:27 +00:00
Marek Olšák	0c1801706e	ac/llvm: handle external textures in ac_nir_lower_resinfo Fixes: `4f622d62d0` - ac/nir: add ac_nir_lower_resinfo Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6993 Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17902>	2022-08-05 09:04:17 +00:00
Marek Olšák	4f622d62d0	ac/nir: add ac_nir_lower_resinfo Emulating image_get_resinfo should be faster than using the hw. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17693>	2022-08-03 17:44:15 +00:00
Marek Olšák	f129db911b	radeonsi/gfx11: use a better workaround for the export conflict bug This is recommended for better performance. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17864>	2022-08-03 00:57:16 +00:00
Marek Olšák	2ed9eb1b63	radeonsi/gfx11: enable shader prefetch except for initial chip revisions Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17864>	2022-08-03 00:57:16 +00:00
Marek Olšák	9e9cc62912	radeonsi: follow shader_info.float_controls_execution_mode (mostly) Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17864>	2022-08-03 00:57:16 +00:00
Marek Olšák	5c0b0f0058	ac/surface: don't forbid 256KB swizzle modes on smaller gfx11 chips let addrlib make the right choice Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17864>	2022-08-03 00:57:16 +00:00
Marek Olšák	bc85e79bba	ac/gpu_info: require amdgpu DRM 3.15.0 (kernel 4.12) from July 2017 to match the radeon requirement Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>	2022-07-27 05:01:38 +00:00
Marek Olšák	9f6a64b1c3	gallium/radeon: require radeon DRM 2.50.0 (kernel 4.12) from July 2017 This is the latest radeon DRM version. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>	2022-07-27 05:01:38 +00:00
Marek Olšák	8426cf9132	ac/gpu_info: remove unused has_unaligned_shader_loads Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>	2022-07-27 05:01:38 +00:00
Marek Olšák	f3f00f77ad	ac/gpu_info: remove amdgpu_gpu_info parameter from ac_query_gpu_info Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>	2022-07-27 05:01:38 +00:00
Marek Olšák	41888505fc	ac/gpu_info: use drm_amdgpu_device_info instead of amdgpu_gpu_info These fields are identical but the latter is from libdrm. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>	2022-07-27 05:01:38 +00:00
Marek Olšák	abd188ec1c	radeonsi: remove workarounds for radeon DRM < 2.45.0 Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>	2022-07-27 05:01:38 +00:00
Marek Olšák	3657cdafd6	amd: require amdgpu DRM 3.2.0 from April 2016 This removes an early bug workaround. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>	2022-07-27 05:01:38 +00:00
Marek Olšák	ff19666a0d	ac/gpu_info: remove redundant vcn_encode Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>	2022-07-27 05:01:38 +00:00
Marek Olšák	89113c0338	ac/gpu_info: remove redundant vce_encode Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>	2022-07-27 05:01:38 +00:00
Marek Olšák	9cbbdc6583	ac/gpu_info: remove redundant uvd_encode Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>	2022-07-27 05:01:38 +00:00
Marek Olšák	2972ceccfd	ac/gpu_info: remove redundant jpeg_decode Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>	2022-07-27 05:01:38 +00:00
Marek Olšák	a0d2e16c91	ac/gpu_info: remove redundant uvd_decode Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>	2022-07-27 05:01:38 +00:00
Marek Olšák	12c5d64fae	ac/gpu_info: remove vram_size and gtt_size in favor of *_kb variants Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>	2022-07-27 05:01:38 +00:00
Marek Olšák	983223de5d	ac/gpu_info: use the kernel-reported GFX IP version to set gfx_level hopefully this won't break the world. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>	2022-07-27 05:01:38 +00:00
Marek Olšák	6504d7172c	ac/gpu_info: use hw_ip::ip_discovery_version to set IP versions Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>	2022-07-27 05:01:38 +00:00
Marek Olšák	9552da66cc	ac/gpu_info: rework and extend device info to be more informative This is the result with AMD_DEBUG=info: Device info: name = NAVI23 marketing_name = AMD Radeon RX 6600 num_se = 2 num_rb = 8 num_cu = 28 max_gpu_freq = 2750 MHz max_gflops = 9856 GFLOPS l0_cache_size = 16 KB l1_cache_size = 128 KB l2_cache_size = 2048 KB l3_cache_size = 32 MB memory_channels = 8 (TCC blocks) memory_size = 8 GB (8192 MB) memory_freq = 14 GHz memory_bus_width = 128 bits memory_bandwidth = 224 GB/s clock_crystal_freq = 100000 KHz IP GFX 10.3 queues:1 IP COMP 10.3 queues:4 IP SDMA 5.2 queues:2 IP VCN_DEC 3.0 queues:1 IP VCN_ENC 3.0 queues:1 IP VCN_JPG 3.0 queues:1 It might not be 100% correct with other chips. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>	2022-07-27 05:01:38 +00:00
Marek Olšák	dd6b001775	ac/gpu_info: remove tabs Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>	2022-07-27 05:01:38 +00:00
Marek Olšák	f218c3d795	ac/gpu_info: rename info fields to num_cu, memory_bus_width, memory_freq_mhz Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>	2022-07-27 05:01:38 +00:00
Timur Kristóf	8d7ca7783b	ac/nir/ngg: Remember proper bit sizes of GS output variables. The LLVM backend keeps track of 16-bit output variables and it will miscompile shaders when these outputs aren't the correct bitsize. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17706>	2022-07-22 08:17:39 +00:00
Timur Kristóf	e60fbb4dc9	ac/nir/ngg: Copy comment about LDS layout for NGG GS. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17706>	2022-07-22 08:17:39 +00:00
Timur Kristóf	dd781c1ccb	ac/nir/ngg: Create output variable for primitive ID export. This makes the RADV/LLVM backend happy and mitigates a crash. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17581>	2022-07-21 21:53:29 +00:00
Timur Kristóf	b0a7db1d3b	ac/nir/ngg: Move primitive ID workgroup barrier to proper place. Previously, it was in a divergent branch, therefore it could hang the GPU when a workgroup had a primitive-only wave. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17581>	2022-07-21 21:53:29 +00:00
Qiang Yu	754e43369d	ac/nir/ngg: Decouple primitive ID store and primitive export. There's no dependency between them. This can simplify the compiler backend translation by always storing prim id before vertex export, which also benefits the LLVM backend in later changes. Signed-off-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17581>	2022-07-21 21:53:29 +00:00
Timur Kristóf	822e370390	radv: Allow reusing pipeline compute state emit functions. We are going to reuse them outside of radv_pipeline. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16531>	2022-07-20 19:00:30 +00:00
Qiang Yu	eeaf0b1888	ac/nir/ngg: add a barrier before prim id export When culling is enabled, it uses LDS space, which overlaps with the prim id export. Fixes: `e97f0463a8` ("ac/nir: Implement NGG deferred attribute culling in NIR.") Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17593>	2022-07-18 09:50:09 +00:00
Qiang Yu	0b7ef846b3	ac/nir/ngg: fix nogs culling scratch size Should be in bytes not dwords. Fixes: `e97f0463a8` ("ac/nir: Implement NGG deferred attribute culling in NIR.") Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17593>	2022-07-18 09:50:09 +00:00
Marek Olšák	9a39da359e	ac/surface: expose all 64K_R_X and 256K_R_X modifiers on gfx11 Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17410>	2022-07-09 21:00:51 +00:00
Marek Olšák	3514b73244	amd: update addrlib - trivial changes Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17410>	2022-07-09 21:00:51 +00:00
Bas Nieuwenhuizen	10211913e1	radv: Add perf counter lock/unlock commandbuffers. These set the pass and make sure we don't have multiple submissions at the same time touching the perf counters/pass at the same time. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16879>	2022-07-09 12:29:06 +00:00
Bas Nieuwenhuizen	6cfc2e91e8	radv: Add performance counter reg write. Needed for reliably writing performance counter selectors. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16879>	2022-07-09 12:29:05 +00:00
Timur Kristóf	8bfeb467bf	ac/nir/ngg: Ignore driver location for mesh shader outputs. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17244>	2022-07-01 18:09:07 +00:00
Timur Kristóf	2ac3e921e3	ac/nir/ngg: Refactor LDS instructions in NGG GS vertex emit and export. Change NGG GS emit vertex code to emit combined shared stores, also change the export vertex code to emit combined shared loads. This results in more optimal code generation, ie. fewer LDS instructions are generated. GS vertices are stored using an odd stride to minimize the chance of bank conflicts, which means that unfortunately we still can't use an alignment higher than 4 here, so the best we can get are some ds_read2_b32 instructions. Fossil DB stats on Navi 21 (formerly Sienna Cichlid): Totals from 135 (0.10% of 128653) affected shaders: VGPRs: 6416 -> 6512 (+1.50%) CodeSize: 529436 -> 503792 (-4.84%) MaxWaves: 2952 -> 2924 (-0.95%) Instrs: 93384 -> 90176 (-3.44%) Latency: 290283 -> 293611 (+1.15%); split: -0.36%, +1.50% InvThroughput: 81218 -> 82598 (+1.70%) Copies: 6603 -> 6606 (+0.05%) PreVGPRs: 5037 -> 5076 (+0.77%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11425>	2022-06-30 18:15:50 +02:00
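
Note on 35f053ba8c ("radv: Fix corrupted mipmap copies on GFX9+"): the commit message attributes the missing blocks to divide-by-two integer math. The sketch below is only an illustration of that kind of rounding loss, not RADV or hardware code. It assumes 4-texel-wide compression blocks and a simplified model in which the mip width is obtained by halving the base-level block count; the function names and the 20-texel example width are hypothetical.

```python
# Illustrative only: a simplified model of how halving a block count can
# drop a block. The 4-texel block width and the halving model are
# assumptions, not taken from RADV or AMD hardware documentation.

def blocks_needed(base_px: int, level: int, block: int = 4) -> int:
    """Blocks required to cover the mip level's texel width."""
    px = max(1, base_px >> level)          # texel width of this mip level
    return (px + block - 1) // block       # round up to whole blocks

def blocks_by_halving(base_px: int, level: int, block: int = 4) -> int:
    """Blocks obtained by halving the base level's block count."""
    base_blocks = (base_px + block - 1) // block
    return max(1, base_blocks >> level)    # integer halving rounds down

if __name__ == "__main__":
    width = 20  # hypothetical base width in texels (5 blocks)
    for level in range(4):
        need = blocks_needed(width, level)
        got = blocks_by_halving(width, level)
        print(f"level {level}: needed={need} halved={got} missing={need - got}")
```

With these assumptions, a 20-texel-wide image needs 3 blocks at level 1 (10 texels), while halving the 5 base blocks yields 2, so one block column is missing. This matches the kind of gap the second copy described in the commit message is meant to fill.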


1980 commits