fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-01-09 23:30:13 +01:00

Author	SHA1	Message	Date
Qiang Yu	3507cdc59c	ac/nir: legacy vs/gs use nir_xfb_info to replace pipe_stream_output_info pipe_stream_output_info is built from nir_xfb_info, why not just use nir_xfb_info directly. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20015>	2022-11-29 03:28:42 +00:00
Samuel Pitoiset	505290dc44	ac/nir,radv: rework and fix NGG queries enables for VS/TES XFB queries need to be enabled with NGG streamout and VS/TES. Previously, the NGG lowering code relied on has_prim_query for XFB. This fixes failures with RADV_PERFTEST=ngg_streamout on GFX10.3 with the vkd3d-proton testsuite. Vulkan CTS is missing TES tests with XFB queries apparently. Cc: 22.3 mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19493>	2022-11-07 14:54:53 +00:00
Rhys Perry	140cefe95a	ac/nir: lower gfx11 vertex parameter exports Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19228>	2022-10-31 14:33:43 +00:00
Rhys Perry	93fb84237f	ac/nir: add ac_nir_lower_ngg_options These signatures were getting ridiculous. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19340>	2022-10-27 13:31:40 +00:00
Qiang Yu	e536d0fe4b	ac/nir/ngg,radv: move LDS layout calculation out of nir ngg lowering Use LDS base load intrinsics in nir ngg lowering to get the layout, leaving its calculation to the driver. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18832>	2022-10-27 07:35:01 +00:00
Qiang Yu	54eea0e393	ac/nir/ngg: pass primitive_id_location as param for nogs lower radeonsi needs to use packed driver locations for all outputs, while radv needs to use VARYING_SLOT_*. Pass the location as a parameter to meet both drivers' needs. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18832>	2022-10-27 07:35:01 +00:00
Rhys Perry	1c005e72f4	ac/nir: add legacy streamout and GS copy shader helpers Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19302>	2022-10-25 17:35:08 +00:00
Qiang Yu	188a7f9226	ac/nir/ngg: add query param to ac_nir_lower_ngg_gs radeonsi may disable it. gfx_level will also be used by the later vertex param export on gfx11. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17457>	2022-10-25 12:58:43 +00:00
Qiang Yu	97e1613b0e	ac/nir/ngg: use nir_load_provoking_vtx_in_prim_amd in ngg lower Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19166>	2022-10-20 06:53:56 +00:00
Qiang Yu	074f3216f2	ac/nir/ngg: support gs streamout Port from radeonsi. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17654>	2022-09-16 08:51:28 +00:00
Qiang Yu	5ec79f9899	ac/nir/ngg: nogs support streamout Port from radeonsi. Works on both GFX11 and GFX10. Although GFX10 can do atomic GDS add on all threads, for now we just disable NGG streamout for GFX10, so it's OK. The GFX11 implementation differs from radeonsi in that we do all 4 buffer/stream info calculations on a single thread. This is simply because it is simpler: we need to update GDS on a single thread anyway, and streamout is not so performance critical that losing a small number of instructions matters. We may change to a better implementation when using register-based streamout. When streamout is enabled, ES threads need to save all vertex attributes to LDS in addition to position. This is because we don't know where in the streamout buffer to export the attributes to, or whether there is space in the streamout buffer. Streamout is done per primitive, so we need to check whether there is space and where the current primitive should be written by GDS atomic add, then do the streamout in GS threads. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17654>	2022-09-16 08:51:28 +00:00
Qiang Yu	f75452918b	ac/nir/ngg: support clipdist culling Port from radeonsi. Besides vertex-position-based primitive culling, the clipdist attribute can also be used to cull a primitive. Normally it's used by the fixed-function pipeline, but with NGG we can treat it as a culling condition to filter out invisible primitives before the fixed-function pipeline. There are two kinds of clipdist: 1. The user defines a clip plane explicitly with glClipPlane(), and the fixed-function pipeline combines it with the vertex position to get the clipdist, then culls. This is the legacy way. 2. GLSL now defines gl_ClipDistance/gl_CullDistance so that the user can calculate clipdist in any way they like. This implementation supports both ways. Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651>	2022-08-26 05:50:30 +00:00
Qiang Yu	1bdeb961bd	ac/nir/ngg: add gs culling Port from radeonsi. Cull primitives after the GS threads and before the final vertex/primitive export. GS culling is like VS/TES culling: it reads the saved vertex positions of a primitive from LDS, then calls the primitive culling algorithm to check whether it's visible or not; only primitives that pass are exported. Unlike VS/TES culling, which reads the vertex indices of a primitive from VGPRs as shader args, GS sets a primitive-complete flag in LDS for the last vertex of each primitive, so that the vertex thread knows the previous 1/2/3 vertices can form a primitive and does primitive culling. Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651>	2022-08-26 05:50:30 +00:00
Qiang Yu	db0e9d3cab	ac/nir/ngg: support line culling Port from ac_llvm_cull.c Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651>	2022-08-26 05:50:30 +00:00
Qiang Yu	f1f2c931a7	ac/nir/cull: support caller react when primitive is rejected Make accept_func optional, and return the accept result so the caller can react when a primitive is rejected. This is for GS culling. Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651>	2022-08-26 05:50:30 +00:00
Timur Kristóf	c721f751f2	ac/nir/ngg: Move LDS store of accepted flag into the inner branch. For primitives which are rejected based on only W and face, this will reduce the number of executed branches. Fossil DB stats on Navi 21: Totals from 60918 (45.16% of 134906) affected shaders: CodeSize: 160330564 -> 160086644 (-0.15%) Instrs: 30477385 -> 30477916 (+0.00%); split: -0.00%, +0.00% Latency: 139802763 -> 139587915 (-0.15%); split: -0.15%, +0.00% InvThroughput: 21198444 -> 21184261 (-0.07%); split: -0.07%, +0.00% SClause: 749811 -> 749810 (-0.00%) Copies: 2701482 -> 2762930 (+2.27%); split: -0.00%, +2.28% Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17870>	2022-08-05 22:10:28 +00:00
Marek Olšák	4f622d62d0	ac/nir: add ac_nir_lower_resinfo Emulating image_get_resinfo should be faster than using the hw. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17693>	2022-08-03 17:44:15 +00:00
Qiang Yu	e9f1f115fa	ac/nir: add triangle_strip_adjacency_fix to gs input lower From radeonsi. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16788>	2022-06-27 11:32:43 +08:00
Qiang Yu	109eb378e5	ac/nir: change es output lower param to esgs_itemsize radeonsi may add extra dword to the stride, so let's pass it directly. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16788>	2022-06-27 11:32:34 +08:00
Qiang Yu	8b5e8b2af7	ac/nir: remove unused param num_reserved_es_outputs from gs input lower Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16788>	2022-06-27 11:32:30 +08:00
Qiang Yu	d00845faf4	ac/nir: add no_input_lds_space param to hs output lower This is used by radeonsi to save some lds space when all LS output is passed by register. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>	2022-06-27 02:38:21 +00:00
Qiang Yu	ae9b02b4d0	ac/nir: add wave_size parameter to ac_nir_lower_hs_outputs_to_mem Used by radeonsi and radv to reflect true wave size used, not minimal size. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>	2022-06-27 02:38:21 +00:00
Qiang Yu	18d51831a8	ac/nir: add pass_tessfactors_by_reg param to hs output lower radeonsi won't emit tess factor in the lower pass, need to keep the output for llvm backend to pass it as parameter. This is used by radeonsi for an optimization to save LDS write. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>	2022-06-27 02:38:21 +00:00
Qiang Yu	6ccb9634de	ac/nir: use nir_intrinsic_load_hs_out_patch_data_offset_amd in tess lower radeonsi load this from SGPR arg, can't use static value because TCS output and TES input may not match (TCS output is not a key for TES) and determined in runtime. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>	2022-06-27 02:38:21 +00:00
Qiang Yu	2ba6d2b107	ac/nir: remove unused parameter in tes input lower Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>	2022-06-27 02:38:21 +00:00
Qiang Yu	3aa70d92ce	radv: no need to do gs_alloc_req for newer chips in ngg vs/tes Copy from radeonsi. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17130>	2022-06-27 02:12:13 +00:00
Samuel Pitoiset	fe57fe1fd8	ac/nir/ngg: count the number of generated primitives for VS and TES Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15639>	2022-06-09 08:02:39 +00:00
Timur Kristóf	b664279755	ac/nir/ngg: Use mesh shader scratch ring when outputs don't fit LDS. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16737>	2022-06-08 08:43:51 +00:00
Timur Kristóf	f7f2770e72	ac/nir: Add remappability to tess and ESGS I/O lowering passes. This will be used for radeonsi to map common I/O location to fixed slots agreed by different shader stages. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16418>	2022-06-07 01:40:14 +00:00
Qiang Yu	6a95452ddf	ac/nir: use nir_intrinsic_load_lshs_vertex_stride_amd For radeonsi which pass this value by argument. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16418>	2022-06-07 01:40:14 +00:00
Timur Kristóf	c69b771e35	radv, ac/nir: Fix multiview layer export for mesh shaders. Unfortunately, radv_lower_multiview is not suitable for mesh shaders because it can't know the mapping between API mesh shader invocations and output primitives. Additionally, when lowering view id to layer, it must be created as a per-primitive PS input. Fixes: `d32656bc65` Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16504>	2022-05-31 07:58:29 +00:00
Marek Olšák	39800f0fa3	amd: change chip_class naming to "enum amd_gfx_level gfx_level" This aligns the naming with PAL. Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16469>	2022-05-13 14:56:22 -04:00
Timur Kristóf	7de3034897	ac/nir: Add I/O lowering for task and mesh shaders. Task shaders store their output payload to VRAM where mesh shaders read from. There are two ring buffers: 1. Draw ring: this is where mesh dispatch sizes and the ready bit are stored. 2. Payload ring: this is where the optional payload is stored (up to 16K per task workgroup). Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14929>	2022-05-12 00:29:51 +00:00
Timur Kristóf	212f183c1f	ac/nir: Remove now-superfluous ac_nir_lower_tess_to_const. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13155>	2022-05-10 17:16:03 +00:00
Timur Kristóf	719678f891	ac/nir: Add ac_nir_load_arg helper for shader arguments. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13155>	2022-05-10 17:16:03 +00:00
Marek Olšák	11c28d9798	ac: add ac_nir_optimize_outputs, a NIR version of ac_optimize_vs_outputs ac_optimize_vs_outputs is an LLVM IR pass, and it will be replaced by this. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14414>	2022-04-22 22:21:11 +00:00
Rhys Perry	61ac5acca3	radv,ac/nir: lower global access to _amd global access intrinsics fossil-db (Sienna Cichlid): Totals from 400 (0.30% of 134621) affected shaders: VGPRs: 18696 -> 18688 (-0.04%) CodeSize: 2031348 -> 1946640 (-4.17%) Instrs: 374703 -> 360226 (-3.86%) Latency: 4200727 -> 4108628 (-2.19%); split: -2.20%, +0.01% InvThroughput: 1059935 -> 1029441 (-2.88%); split: -2.88%, +0.00% VClause: 5777 -> 5771 (-0.10%) SClause: 11890 -> 10891 (-8.40%); split: -8.57%, +0.17% Copies: 34035 -> 33259 (-2.28%); split: -2.98%, +0.70% Branches: 11108 -> 11100 (-0.07%); split: -0.08%, +0.01% PreSGPRs: 15999 -> 15942 (-0.36%); split: -0.44%, +0.08% PreVGPRs: 16994 -> 16970 (-0.14%) fossil-db (Polaris10): Totals from 400 (0.29% of 135668) affected shaders: SGPRs: 23799 -> 22919 (-3.70%); split: -4.30%, +0.61% VGPRs: 18480 -> 18472 (-0.04%) CodeSize: 2090316 -> 2041592 (-2.33%) Instrs: 395461 -> 385747 (-2.46%); split: -2.46%, +0.00% Latency: 5045768 -> 5020196 (-0.51%); split: -0.53%, +0.02% InvThroughput: 2694320 -> 2689886 (-0.16%); split: -0.23%, +0.07% VClause: 5982 -> 5968 (-0.23%) SClause: 12064 -> 10823 (-10.29%); split: -10.33%, +0.04% Copies: 48233 -> 48322 (+0.18%); split: -0.47%, +0.65% PreSGPRs: 16409 -> 16358 (-0.31%); split: -0.39%, +0.08% fossil-db (Pitcairn): Totals from 400 (0.29% of 135668) affected shaders: SGPRs: 22431 -> 22215 (-0.96%); split: -2.60%, +1.64% VGPRs: 18776 -> 18560 (-1.15%); split: -1.21%, +0.06% CodeSize: 2104440 -> 2017708 (-4.12%) MaxWaves: 2363 -> 2367 (+0.17%) Instrs: 413099 -> 397446 (-3.79%) Latency: 5507707 -> 5450251 (-1.04%); split: -1.12%, +0.07% InvThroughput: 2838867 -> 2786903 (-1.83%); split: -1.83%, +0.00% VClause: 10334 -> 10097 (-2.29%) SClause: 12346 -> 11005 (-10.86%); split: -10.89%, +0.02% Copies: 54034 -> 52065 (-3.64%); split: -3.99%, +0.35% PreSGPRs: 17916 -> 17857 (-0.33%); split: -0.40%, +0.07% PreVGPRs: 16917 -> 16893 (-0.14%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124>	2022-04-13 16:23:35 +00:00
Marek Olšák	116a05c721	ac: move ac_exp_param.h to ac_nir.h Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14266>	2022-01-05 12:46:31 +00:00
Timur Kristóf	7aa42e023a	ac/nir/ngg: Lower NV mesh shaders to NGG semantics. Lower mesh shader outputs to shared memory. At the end of the shader, read the outputs from shared memory and export their values as NGG expects. We allocate separate shared memory (LDS) areas for per-vertex, per-primitive outputs, primitive indices, primitive count. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13580>	2021-12-31 13:05:09 +00:00
Samuel Pitoiset	b52aaea630	radv: remove unnecessary ac_nir_ngg_config output struct Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13134>	2021-10-04 08:55:19 +00:00
Samuel Pitoiset	52e91f7640	radv: move ngg passthrough determination earlier Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13134>	2021-10-04 08:55:19 +00:00
Samuel Pitoiset	2ce78a30ff	move: move ngg lds bytes determination earlier Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13134>	2021-10-04 08:55:19 +00:00
Samuel Pitoiset	90858dd718	radv: move ngg early prim export determination earlier Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13134>	2021-10-04 08:55:19 +00:00
Rhys Perry	24501b5452	radv: move ngg culling determination earlier Co-Authored-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13134>	2021-10-04 08:55:19 +00:00
Timur Kristóf	a7f2faea46	ac/nir: Emit edge flag instructions conditionally. They are not needed by RADV but will be needed by RadeonSI. Fossil DB results on Sienna Cichlid (with NGGC on): Totals from 56917 (44.24% of 128647) affected shaders: VGPRs: 1982664 -> 1975936 (-0.34%); split: -0.43%, +0.09% CodeSize: 152790880 -> 149510316 (-2.15%); split: -2.15%, +0.00% MaxWaves: 1617984 -> 1621900 (+0.24%) Instrs: 29272825 -> 28907038 (-1.25%); split: -1.26%, +0.01% Latency: 128744182 -> 127565678 (-0.92%); split: -1.14%, +0.22% InvThroughput: 20125915 -> 19805168 (-1.59%); split: -1.63%, +0.03% VClause: 521312 -> 519804 (-0.29%); split: -0.77%, +0.48% SClause: 688861 -> 688897 (+0.01%); split: -0.04%, +0.05% Copies: 3205421 -> 3177799 (-0.86%); split: -1.68%, +0.82% Branches: 1181457 -> 1183147 (+0.14%); split: -0.03%, +0.17% PreVGPRs: 1626681 -> 1595406 (-1.92%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12998>	2021-09-23 16:57:56 +02:00
Timur Kristóf	f4a65e5628	ac/nir/nggc: Only repack arguments that are needed. Don't repack everything, only what is actually used. The goal of this commit is primarily to remove unnecessary LDS stores and loads. In addition to that, it also gets rid of a few VALU instructions and reduces VGPR use. Fossil DB stats on Sienna Cichlid with NGGC on: Totals from 6951 (5.40% of 128647) affected shaders: VGPRs: 206056 -> 205360 (-0.34%); split: -0.79%, +0.45% CodeSize: 12344568 -> 12269312 (-0.61%); split: -0.62%, +0.01% MaxWaves: 211206 -> 212196 (+0.47%) Instrs: 2319459 -> 2308483 (-0.47%); split: -0.50%, +0.03% Latency: 7220829 -> 7164721 (-0.78%); split: -1.21%, +0.43% InvThroughput: 1051450 -> 1049191 (-0.21%); split: -0.36%, +0.15% VClause: 25794 -> 25445 (-1.35%); split: -1.97%, +0.61% SClause: 39192 -> 39277 (+0.22%); split: -0.21%, +0.43% Copies: 315756 -> 313404 (-0.74%); split: -1.17%, +0.42% Branches: 127878 -> 127879 (+0.00%); split: -0.00%, +0.00% PreVGPRs: 168029 -> 160162 (-4.68%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12246>	2021-09-01 14:45:14 +00:00
Timur Kristóf	8341af5109	radv, aco, ac/nir: Tweak position export scheduling for NGG culling. The result is about +5-ish fps in Doom Eternal. It turns out that the location of position exports matters more than we thought, and it's actually better to keep them at the bottom for culling shaders rather than schedule it up to the top. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>	2021-07-13 23:56:33 +00:00
Timur Kristóf	fc1fabbabf	ac/nir: Analyze culling shaders to remember which inputs are used when. These will be useful for some optimizations. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>	2021-07-13 23:56:33 +00:00
Timur Kristóf	e97f0463a8	ac/nir: Implement NGG deferred attribute culling in NIR. Culling is traditionally done by the rasterizer, but that can be a bottleneck when an app creates a large number of primitives. E.g. a lot of tiny triangles reduce the rasterization efficiency. NGG makes it possible for the shader to check primitives and delete those that it can prove are not needed. After this is done, we have to repack the surviving invocations so they remain compact. This also saves bandwidth, because some memory loads are only executed by those vertices that survived the culling. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>	2021-07-13 23:56:33 +00:00
Timur Kristóf	651a3da1b5	ac/nir: Add a NIR port of ac_llvm_cull. The algorithms were originally implemented by Marek Olšák, hence the copyright to AMD. This commit just ports the LLVM based implementation to NIR, using the new intrinsics added earlier. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>	2021-07-13 23:56:33 +00:00

55 commits