fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-14 07:58:16 +02:00

Author	SHA1	Message	Date
Rhys Perry	c122315702	aco: fix get_ssbo_size with a vgpr resource The result of load_vulkan_descriptor is passed directly to get_ssbo_size. This caused convert_pointer_to_64_bit() to skip creating a v_readfirstlane_b32 if it was necessary. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `05b6612b4e` ('radv: do not lower UBO/SSBO access to offsets') Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3628 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7095>	2020-10-13 14:20:28 +00:00
Rhys Perry	bcf7a70008	aco: use nir_opt_uniform_atomics Significantly improves performance of a Control compute shader. Also seems to increase FPS at the very start of the game by ~9% (RX 580, 1080p, medium settings, no MSAA). fossil-db (Navi): Totals from 315 (0.23% of 135946) affected shaders: SGPRs: 18296 -> 18336 (+0.22%); split: -0.26%, +0.48% VGPRs: 11856 -> 11844 (-0.10%); split: -0.81%, +0.71% CodeSize: 2233800 -> 2457508 (+10.01%) MaxWaves: 4506 -> 4497 (-0.20%); split: +0.04%, -0.24% Instrs: 438766 -> 486215 (+10.81%); split: -0.00%, +10.81% Cycles: 7880180 -> 8963340 (+13.75%); split: -0.00%, +13.75% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6558>	2020-10-13 12:47:21 +00:00
Rhys Perry	e1120f274f	nir: move divergence analysis options to nir_shader_compiler_options Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6558>	2020-10-13 12:47:21 +00:00
Rhys Perry	bb5c0ba0d2	aco: implement last_invocation Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6558>	2020-10-13 12:47:21 +00:00
Rhys Perry	8850a63161	radv/aco,nir/lower_subgroups: don't lower elect ACO can implement this better. fossil-db (Navi): Totals from 33 (0.02% of 135946) affected shaders: SGPRs: 1736 -> 1744 (+0.46%) VGPRs: 1680 -> 1656 (-1.43%) CodeSize: 246160 -> 245916 (-0.10%); split: -0.14%, +0.04% MaxWaves: 449 -> 461 (+2.67%) Instrs: 48301 -> 48266 (-0.07%); split: -0.12%, +0.05% Cycles: 469740 -> 469240 (-0.11%); split: -0.18%, +0.08% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6558>	2020-10-13 12:47:20 +00:00
Rhys Perry	36da9c4aa2	aco: implement elect Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6558>	2020-10-13 12:47:20 +00:00
Rhys Perry	bf77f539ee	aco: optimize more uniform reductions/scans Uniform atomic optimization will create these. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6558>	2020-10-13 12:47:20 +00:00
Samuel Pitoiset	b9ca4923d6	aco: implement missing nir_op_unpack_half_2x16_split_{x,y}_flush_to_zero SPIRV->NIR emits nir_op_unpack_half_2x16_flush_to_zero instead of nir_op_unpack_half_2x16 if the shader enables denorm flush to zero for 16-bit floating point. This doesn't fix anything known and CTS doesn't have tests. Fixes: `56d9bcdded` ("radv: enable more float_controls features") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6939>	2020-10-13 08:35:22 +02:00
Eric Engestrom	c02e933de4	radv: add missing u_atomic.h include Fixes: `7568c97df1` ("radv: Use atomics to read query results.") Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7050>	2020-10-12 23:24:18 +02:00
Bas Nieuwenhuizen	1fb3e1fb70	radv: Fix mipmap extent adjustment on GFX9+. With arrays we really have to use the correct size for the base mipmap to get the right array pitch. In particular, using surf_pitch results in pitch that is bigger than the base mipmap and hence results in wrong pitches computed by the HW. It seems that on GFX9 this has mostly been hidden by the epitch provided in the descriptor but this is not something we do on GFX10 anymore. Now this has some draw-backs: 1. normalized coordinates don't work 2. Bounds checking uses slightly bigger bounds. 2 mostly is not an issue as we still ensure that they're within the texture memory and not overlapping other layers/mips, but we can't properly ignore writes. 1 is kinda dead in the water ... On the other hand I'd argue that using normalized coords & a filter for sampling a block view of a compressed format is extraordinarily useless. The old method we employed already had these drawbacks for everything except the base miplevel of the imageview. AFAICT this is the same tradeoff AMDVLK makes and no CTS test hits this. (once it does I think the HW is dead in the water ... Only workaround I can think of is shader processing which is hard because we don't know texture formats at compile time.) I also removed the extra calculations when the image has only 1 mip level because they ended up being a no-op in that case. CC: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2292 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2266 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2483 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2906 Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3607 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7090>	2020-10-12 21:00:00 +00:00
Samuel Pitoiset	66d7bb0f23	radv: fix adjusting vertex alpha AC_FETCH_FORMAT_NONE is not zero... Oops. Fixes: `b0829c6af7` ("radv: replace RADV_ALPHA_ADJUST by AC_FETCH_FORMAT") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7103>	2020-10-12 19:48:35 +02:00
Samuel Pitoiset	b32a8f83dc	radv: move lower_io_arrays_to_elements before lower_io_to_scalar_early nir_lower_io_arrays_to_elements lowers arrays or matrices to elements, which ends up to vectors for matrices, but a bunch of IO optimizations only work for scalars. Calling it before lower_io_to_scalar_early allows nir_link_opt_varyings to remove duplicated inputs and replace constant inputs. fossils-db (Navi10): Totals from 294 (0.22% of 136546) affected shaders: CodeSize: 861356 -> 860224 (-0.13%); split: -0.13%, +0.00% Instrs: 161972 -> 161832 (-0.09%); split: -0.09%, +0.00% Cycles: 1185680 -> 1185120 (-0.05%); split: -0.05%, +0.00% SMEM: 31422 -> 31424 (+0.01%) Copies: 9065 -> 9068 (+0.03%) Only Talos and Dark Souls 3 are affected. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7041>	2020-10-12 15:01:05 +00:00
Samuel Pitoiset	b0829c6af7	radv: replace RADV_ALPHA_ADJUST by AC_FETCH_FORMAT Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7065>	2020-10-12 13:13:40 +00:00
Samuel Pitoiset	5000c344cc	ac/llvm: move AC_FETCH_FORMAT to non-LLVM code While we are at it, give it a name. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7065>	2020-10-12 13:13:40 +00:00
Rhys Perry	cf3b638f47	radv: remove RDR2 discard workaround The game appears to use HLSL, so this workaround now lives in SPIR-V -> NIR. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7062>	2020-10-12 11:07:39 +00:00
Bas Nieuwenhuizen	875ff8414f	radv/winsys: Expand scope of allbos lock. With us not creating a bo_list anymore, there is a problem if we delete a buffer between enumerating all buffers and doing the submission. Also changes this to a rwlock given the wider scope of the things under lock. (especially some of the syncobj stuff is now under the lock) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7091>	2020-10-12 10:55:08 +00:00
Bas Nieuwenhuizen	ea778693bf	radv: Fix event write cmdbuffer allocation when tracing. The trace emit is another 7 words. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7091>	2020-10-12 10:55:08 +00:00
Samuel Pitoiset	98f538dfca	radv: remove one leftover TODO in the shader info pass Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7022>	2020-10-12 09:23:26 +02:00
Samuel Pitoiset	cec12d4f98	radv/llvm: reduce LDS size for tess by using NIR IO assigned locations To match ACO. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7022>	2020-10-12 09:23:26 +02:00
Samuel Pitoiset	47e26bf334	radv/llvm: reduce the ESGS itemsize by using NIR IO assigned locations There are no longer gaps in the ESGS ring. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7022>	2020-10-12 09:23:26 +02:00
Samuel Pitoiset	569b894835	radv/llvm: switch to NIR IO assigned locations To match ACO. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7022>	2020-10-12 09:23:25 +02:00
Samuel Pitoiset	6387341cce	ac/nir: pass the variable location to store_tcs_outputs It's actually simpler for the backend to know the variable location. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7022>	2020-10-12 09:23:25 +02:00
Samuel Pitoiset	8f8ee5b95b	ac,radv,radeonsi: stop multiplying driver_location by 4 It's no longer needed to do that. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7010>	2020-10-12 08:34:02 +02:00
Samuel Pitoiset	0a90dab6b4	radv/llvm: stop assigning driver_location in NIR->LLVM It's already assigned just after NIR linking shaders. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7010>	2020-10-12 08:33:57 +02:00
Marek Olšák	a4e4644eff	ac/surface: fix valgrind warnings in DCC retile tile lookups ==12920== Conditional jump or move depends on uninitialised value(s) ==12920== at 0x8F39391: util_fast_urem32 (fast_urem_by_const.h:71) ==12920== by 0x8F39391: hash_table_search (hash_table.c:285) ==12920== by 0x8B06D5D: ac_compute_dcc_retile_tile_indices (ac_surface.c:136) Fixes: `a37aeb128d` "amd/common: Cache intra-tile addresses for retile map." Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7055>	2020-10-09 23:13:40 +00:00
Rhys Perry	8e981453ed	radv: use radv_optimize_nir() less in radv_link_shaders() fossil-db (Navi): Totals from 11 (0.01% of 137413) affected shaders: CodeSize: 99372 -> 99480 (+0.11%) Instrs: 19119 -> 19110 (-0.05%) Cycles: 222144 -> 222000 (-0.06%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6891>	2020-10-09 15:48:00 +00:00
Rhys Perry	55254f241f	radv: move optimizations in shader_compile_to_nir() to after io_to_scalar This results in at least one less radv_optimize_nir() iteration. No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6891>	2020-10-09 15:47:59 +00:00
Bas Nieuwenhuizen	da132d802b	radv: Set fce metadata correctly on DCC initialization. The fce metadata can always be set to false as we don't care about the compressed clear color. Avoiding useless fast clear eliminates improves basemark performance by 1%-1.5%. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7005>	2020-10-09 13:46:49 +00:00
Timur Kristóf	5ae3656890	aco/ngg: Calculate workgroup size of NGG shaders. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	61280bb4b6	aco/ngg: Allocate NGG GS space early for const vertex/primitive counts. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	e8a0409d01	aco/ngg: Use more efficient LDS layout to help reduce bank conflicts. The LLVM backend has a trick which helps reduce LDS bank conflicts by swizzling the LDS address where each vertex is emitted. This commit implements the same thing for ACO. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	9bf92d4357	radv/aco: Enable NGG GS by default. ACO NGG GS now supports everything we need except streamout (aka. transform feedback), but we don't use NGG anyway when streamout is needed. Also add a note to the new features txt. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	dd73719856	aco/ngg: Add shader query support to NGG GS. In each GS thread, we calculate the number of "real" primitives that were emitted (points, lines, triangles, not strips). Then we accumulate the number of "real" primitives emitted by the entire threadgroup in GDS. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	df62c8fbea	aco/ngg: Place workgroup barrier outside control flow for NGG GS. Merged shaders have a workgroup barrier which makes sure that the first half is completed in every wave before the 2nd half is started. This barrier is located in divergent control flow, so that waves that don't have any invocations in the 2nd half can finish as early as possible. This is problematic for NGG GS because it has more workgroup barriers after the 2nd half. So, for NGG GS we need to put the barrier outside control flow because otherwise the waves that have 0 GS threads won't be able to wait for the waves which have non-zero GS threads. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	1129575d5e	aco/ngg: Implement NGG GS output. We store emitted GS vertices in LDS. Then, at the end of the shader, the emitted vertices are compacted and each thread loads a single vertex from LDS in order to export a primitive as needed, and the vertex attributes. The reason this is done is because there is an impedance mismatch between how API GS and the NGG HW work. API GS can emit an arbitrary number of vertices and primitives in each thread, but NGG HW can only export one vertex per thread. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	62b5012ec3	aco/ngg: Implement workgroup reduce / exclusive scan for NGG GS. This function calculates two things at once: 1. The total number of vertices emitted by the threadgroup. 2. Exclusive scan of emitted vertex count across the threadgroup. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	c29e288fb5	aco/ngg: Create LDS layout for NGG GS. For NGG GS, we need to store the following in LDS: 1. The ESGS ring, similarly to legacy ESGS. 2. Emitted vertices from the GS threads. 3. Temporary space used by the workgroup scan. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	2680329fb7	aco/ngg: Setup NGG GS. Make it possible for ACO to recognize when to use HW NGG GS. Also add a few notes about the various GS stages in the comments. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	9c3d8404de	aco/ngg: Allow NGG GS to create VS exports. NGG GS needs to use the same instructions to export vertex attributes at the end. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	b67878f328	aco/ngg: Allow NGG GS to load per-vertex GS inputs. They work the same way as in legacy GS, so we can reuse that. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	8f25d9f821	aco/ngg: Allow NGG GS to store ES outputs. We can reuse the existing ES output code. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	b57b1a06e4	aco/ngg: Clean up and reorganize NGG VS/TES code. Make the NGG VS/TES code easier to follow, give better names to some functions and make ngg_nogs_early_prim_export a variable. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	3645a3106a	aco/ngg: Make primitive export packing less prone to error. Use lshl_or instead of lshl_add, which makes it more robust in handling -1 and -2 indices which will now just become null exports, which is what we want. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	0bfe0495c1	aco/ngg: Refactor ngg_emit_prim_export in preparation for NGG GS. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	b08ced08a2	aco/ngg: Refactor gs_alloc_req in preparation for NGG GS. Previously, this function inferred the vertex and primitive counts from the gs_tg_info shader argument, but in the case of NGG GS, they will need to be calculated at runtime. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	ecfabfd606	aco: Add wave-specific opcode for s_lshl and s_flbit. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	57d8799284	aco: Optimize thread_id_in_threadgroup when there is just one wave. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	5e31fb49a3	aco: Use thread_id_in_threadgroup helper for ES outputs. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	924f816fe1	aco: Extract thread_id_in_threadgroup to a separate function. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	b1964ad4d6	aco: Extract lanecount_to_mask to a separate function. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
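
Note on 62b5012ec3 (workgroup reduce / exclusive scan): the commit message says the function computes both the total number of emitted vertices and an exclusive scan of the per-thread counts. The sketch below is only a sequential Python model of those two results, not ACO's implementation, which works per wave and across waves through LDS. The function name and the sample counts are hypothetical.

```python
from typing import List, Tuple


def reduce_and_exclusive_scan(counts: List[int]) -> Tuple[int, List[int]]:
    """Return (total, offsets) for per-thread emitted vertex counts.

    offsets[i] is the sum of counts[0..i-1], i.e. an exclusive scan.
    Hypothetical helper for illustration; not taken from the Mesa source.
    """
    offsets = []
    running = 0
    for count in counts:
        offsets.append(running)  # exclusive: excludes this thread's own count
        running += count
    return running, offsets


# Four hypothetical GS threads emitting 2, 0, 3 and 1 vertices.
total, offsets = reduce_and_exclusive_scan([2, 0, 3, 1])
print(total, offsets)
```

In this model, each thread's offset could serve as the index of its first vertex in a compacted list, and the total as the number of vertices the threadgroup exports. That reading is an assumption based on the description of compaction in 1129575d5e.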

5983 commits