fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-22 13:30:12 +01:00

Author	SHA1	Message	Date
Marek Olšák	f156abd2a7	radeonsi: simplify how broadcast_last_cbuf is implemented for PS epilogs We don't need to look at the framebuffer state and record how many color buffers to write. Instead, we can deduce which color buffers are enabled from spi_shader_col_format, which already does the right thing. So PS epilogs only need a single bool flag that determines whether all enabled color buffers should be written. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32910>	2025-01-29 07:19:41 +00:00
Marek Olšák	3dcbf743c4	radeonsi: implement replacing frag_coord with pixel_coord at draw time This adds an option into the prolog key to replace frag_coord.xy with pixel_coord when sample shading is disabled, which is most of the time. This reduces the number of input VGPRs. It's already implemented in ac_nir_lower_ps_early for monolithic shaders and the PS prolog in ACO, so this just implements it for the PS prolog in LLVM IR. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32910>	2025-01-29 07:19:40 +00:00
Marek Olšák	d7d4d56f5b	ac,aco,radeonsi: replace SampleMaskIn with 1 << SampleID if full sample shading Since the sample mask is always 1 << sample_id with full sample shading, just use that instead of loading sample_mask_in. Set it to 0 if it's a helper invocation. This removes the sample mask input VGPR. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33024>	2025-01-25 12:20:25 -05:00
Marek Olšák	b1fc34f290	radeonsi: sample shading state fixes - really update sample shading state when it's changed - reduce log state bits in the shader key to 2 because we don't support 16x EQAA - exit early from si_update_ps_iter_samples if ps_iter_sample has the same value since the last call - set missing wqm for the PS prolog (this might fix tests) Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33024>	2025-01-25 12:20:25 -05:00
Marek Olšák	03ad2bc782	radeonsi: make many shader functions static or move them to .c files - many non-inline functions are only used in 1 .c file: make them static - some inline functions are only used in 1 .c file: move them there Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33046>	2025-01-16 02:58:03 +00:00
Marek Olšák	b05fa7d575	radeonsi/gfx12: set DIS_PG_SIZE_ADJUST_FOR_STRIP after shader compilation Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32957>	2025-01-09 23:37:55 -05:00
Marek Olšák	1b405a12e0	radeonsi: only set BREAK_PRIMGRP/WAVE_AT_EOI when TES/GS need PrimID sysval after TES Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32957>	2025-01-09 23:37:52 -05:00
Marek Olšák	f06a103eea	radeonsi: don't set BREAK_PRIMGRP/WAVE_AT_EOI when tessellation is disabled It's not required and it decreases performance. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32957>	2025-01-09 22:44:44 -05:00
Georg Lehmann	aee0c7274c	amd: switch to FRONT_FACE_ALL_BITS(0) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32791>	2024-12-30 22:31:34 +00:00
Marek Olšák	9b7ea720c9	radeonsi: use nir->info instead of sel->info.base sel->info is out of date after shader variant optimizations. We need to stop using it. Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>	2024-12-24 12:02:20 +00:00
Marek Olšák	7ddb28f447	radeonsi: remove some uses of enum pipe_shader_type it's identical to gl_shader_stage Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>	2024-12-24 12:02:20 +00:00
Marek Olšák	6a1bdf2f78	radeonsi/gfx12: tune streamout performance Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>	2024-12-24 12:02:20 +00:00
Marek Olšák	8440184dfd	radeonsi: make NGG streamout output primitive type known at compile time This compiles an optimized shader variant for NGG streamout where the output primitive is known at compile time. This allows putting stores for all vertices into the same VMEM clause. Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>	2024-12-24 12:02:20 +00:00
Marek Olšák	5003465c42	radeonsi: eliminate shader code computing killed Z/S/samplemask PS outputs Compile a monolithic optimized shader to do that, and clean up the comments. Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>	2024-12-24 12:02:20 +00:00
Marek Olšák	58132d6fc8	radeonsi: implement nir_opt_frag_depth using kill_z instead of the NIR pass This uses si_shader_info to store whether gl_FragDepth can be removed, and it uses the kill_z epilog flag to do the removal without recompilation. Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>	2024-12-24 12:02:20 +00:00
Marek Olšák	b56f47611a	radeonsi: fix alpha-to-coverage + alpha-to-one used together for gfx6-10.3 It works exactly like gfx11 except that COVERAGE_TO_MASK_ENABLE must be 1 to indicate that alpha for alpha-to-coverage should be read from mrtz.a. Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>	2024-12-24 12:02:20 +00:00
Marek Olšák	08abddd235	radeonsi/gfx11: fix alpha-to-coverage + alpha-to-one used together alpha-to-coverage must be applied before alpha-to-one. The only way to do that is to export alpha for alpha-to-coverage via mrtz, and export 1 via mrt0.a. ACO and monolithic shader support is already in place thanks to RADV, so we only need to change the LLVM PS epilog and the shader key. Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>	2024-12-24 12:02:20 +00:00
Marek Olšák	de996ac481	radeonsi: kill Z and stencil PS outputs if depth or stencil is disabled This adds kill_z and kill_stencil flags to the shader PS epilog key, which removes those outputs if depth or stencil are disabled. It must be implemented in: * ACO PS epilog * LLVM PS epilog * ac_nir_lower_ps for monolithic shaders Some of the samplemask code wasn't completely correct, but probably harmless. Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>	2024-12-24 12:02:20 +00:00
Marek Olšák	823e9e846e	radeonsi: switch to the new TCS LDS/offchip size computation The new TCS LDS size should be less than what it was before. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31673>	2024-12-18 11:07:59 +00:00
Marek Olšák	85c20def94	ac,radv,radeonsi: enable TCS input reads from VGPRs for all compatible loads Cross-invocation TCS input access doesn't prevent same-invocation access. This improves shaders that use both for the same inputs. Also, if some components of a vec4 slot only use same-invocation access and other components only use cross-invocation access (it's possible after compaction), this takes the VGPR path for the components with same-invocation access, which didn't happen previously because all masks only describe whole vec4s. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31673>	2024-12-18 11:07:59 +00:00
Alyssa Rosenzweig	41076b2a55	radeonsi: use mesa_prim_has_adjacency Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>	2024-12-12 21:16:13 +00:00
Marek Olšák	680f7afe0b	radeonsi: don't use nir_io_dont_optimize because it's deprecated There is a new environment variable that can be used instead. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32257>	2024-11-20 21:08:30 +00:00
Marek Olšák	06292538ae	radeonsi: add helper si_shader_culling_enabled it will contain more logic Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32257>	2024-11-20 21:08:29 +00:00
Marek Olšák	d7415d3717	radeonsi: clean up and rename gfx10_edgeflags_have_effect Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32257>	2024-11-20 21:08:29 +00:00
Marek Olšák	6988967a1f	radeonsi: rewrite/replace gfx10_ngg_get_vertices_per_prim Reuse si_get_input_prim (which is similar) and split it into 2 functions: - si_get_output_prim_simplified - si_get_num_vertices_per_output_prim Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32257>	2024-11-20 21:08:29 +00:00
Marek Olšák	963a84677e	radeonsi: optionally return MESA_PRIM_UNKNOWN from si_get_input_prim it will be used later Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32257>	2024-11-20 21:08:29 +00:00
Marek Olšák	691a9ccb33	radeonsi: prepare for making SI_NGG_CULL_TRIANGLES/LINES VS only, rename them They will have no effect on TES and GS, so this will make it more obvious. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32257>	2024-11-20 21:08:29 +00:00
Marek Olšák	51aa1d8381	radeonsi: fix gl_FrontFace elimination when one side is culled Fixes: `55d81214c9` - radeonsi: replace gl_FrontFacing with a constant if one side is always culled Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32186>	2024-11-19 11:43:20 +00:00
Marek Olšák	5be9d76861	radeonsi: fix an assertion failure in si_shader_ps with AMD_DEBUG=mono assert(!shader->key.ps.part.prolog.force_persp_center_interp || (!G_0286CC_PERSP_SAMPLE_ENA(input_ena) && !G_0286CC_PERSP_CENTROID_ENA(input_ena))); failed when all FS inputs have been eliminated by optimizations, which causes LLVM to set PERSP_SAMPLE_ENA because at least 1 of those must be enabled, which this code didn't expect. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32186>	2024-11-19 11:43:20 +00:00
Marek Olšák	8deb32ac2e	radeonsi: split outputs_written_before_tes_gs into ls_es_* and tcs_* masks these will have different values later Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32171>	2024-11-16 21:59:29 -05:00
Marek Olšák	40d9616bd3	radeonsi: don't pad esgs_vertex_stride if it's 0 so that we don't allocate any LDS for ES->GS varyings if it's unused. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31291>	2024-09-20 19:49:44 +00:00
Marek Olšák	ce72376641	radeonsi: rename SI_CONTEXT_* flags to SI_BARRIER_* flags some of the definition names are changed completely Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31193>	2024-09-17 20:44:58 +00:00
Marek Olšák	834aa812ea	radeonsi: rename si_context::flags -> barrier_flags Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31193>	2024-09-17 20:44:58 +00:00
Marek Olšák	dac99e75af	radeonsi: rename "cache_flush" -> "barrier" Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31193>	2024-09-17 20:44:58 +00:00
Marek Olšák	1d5ffb13d6	radeonsi: add ACQUIRE_MEM, RELEASE_MEM PWS packet helpers Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31168>	2024-09-14 11:03:44 -04:00
Marek Olšák	1a1138817c	radeonsi: add a new PM4 helper radeon_event_write Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31168>	2024-09-14 11:03:44 -04:00
Marek Olšák	b7136d0890	radeonsi: pass TCS inputs_read mask to LS output lowering on GFX9 + monolithic This will allocate less LDS for LS outputs if there are holes between varyings when we have monolithic merged LS+TCS. (it removes the holes) There are 2 steps to this: - add helper si_shader_lshs_vertex_stride and use it everywhere - pass the TCS inputs_read bitmask instead of the "map" callback to si_lower_ls_outputs_mem Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30962>	2024-09-05 19:54:29 +00:00
Samuel Pitoiset	80e8e18cc6	ac: add ac_gfx103_get_cu_mask_ps() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30789>	2024-08-27 14:14:57 +00:00
Qiang Yu	1ee612e1ac	radeonsi: use wave64 for KHR_shader_subgroup enabled shader Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30610>	2024-08-26 10:46:20 +08:00
Qiang Yu	a78d1d49e6	radeonsi: consider both stages to determine merged shader wave_size Previously we determined the wave_size of merged shader stages separately and ignored the conditions that could cause them to differ. Now we determine the wave_size of the TCS/GS part first, then use that wave_size for the VS/TES part. This way we can consider the previous shader stage's information when determining the wave_size of TCS/GS, and the two stages in the merged shader can affect each other's wave_size. This requires si_shader_selector to have two kinds of main part, for wave32 and wave64, in part mode, to be combined with other shader parts of various wave sizes. This also enables merged shader stages with different si_shader_info->has_divergent_loop to use wave32. We'll add another condition for KHR_shader_subgroup later. Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30610>	2024-08-26 10:46:20 +08:00
Qiang Yu	196d91ed78	radeonsi: remove NULL check in si_determine_wave_size This function is always called with non-NULL shader now. Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30610>	2024-08-26 10:46:20 +08:00
Marek Olšák	0e27df4521	radeonsi/gfx12: fix VS output corruption with streamout We increased VS_EXPORT_COUNT to 8 for streamout in gfx10_shader_ngg, but we forgot to increase the attribute ring stride, causing all waves except the first one to get corrupted VS outputs. Fixes: `f703dfd1bb` - radeonsi: add gfx12 Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30503>	2024-08-05 19:35:39 +00:00
Marek Olšák	a5b4ae67ae	ac: add radeon_info::has_scratch_base_registers Fixes: `3b0bfd254f` - radeonsi/gfx11: make flat_scratch changes for compute Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30071>	2024-07-15 13:52:25 -04:00
Marek Olšák	c353394a21	radeonsi: replace si_shader::scratch_bo with scratch_va, don't set it on gfx11+ This removes the unnecessary buffer reference and improves this fragile code. Fixes: `3b0bfd254f` - radeonsi/gfx11: make flat_scratch changes for compute Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11463 Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30071>	2024-07-15 13:52:18 -04:00
Pierre-Eric Pelloux-Prayer	a7a1e3d329	radeonsi: fix crash in si_update_tess_io_layout_state for gfx8 and earlier si_set_patch_vertices was only called if tcs.current was non-NULL but this condition is not enough for GFX9+ since vs is used as ls. Add a check in si_update_tess_io_layout_state instead, and set sctx->do_update_shaders for the case where ls_current is not yet available. This fixes crashes on GFX6. Cc: mesa-stable Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29876>	2024-07-09 15:07:27 +02:00
Patrick Lerda	301a3bacce	radeonsi: fix assert triggered on gfx6 after the tessellation update This change updates the affected calls to the proper function which is radeon_set_config_reg(). For instance, this issue is triggered with "piglit/bin/textureSize tes isampler2DMSArray -auto -fbo": vertex-program-two-side: ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:4981: void si_emit_spi_ge_ring_state(si_context*, unsigned int): Assertion `(0x008988) >= CIK_UCONFIG_REG_OFFSET && (0x008988) < CIK_UCONFIG_REG_END' failed. Fixes: `bd71d62b8f` ("radeonsi: program tessellation rings right before draws") Signed-off-by: Patrick Lerda <patrick9876@free.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29645>	2024-06-11 14:01:21 +00:00
Marek Olšák	fe7a4ed708	radeonsi: use shader_info::use_aco_amd to determine whether to use ACO It's set by si_nir_scan_shader, so we need to use it after that. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>	2024-06-08 05:48:11 +00:00
Samuel Pitoiset	428601095c	ac,radeonsi: import PM4 state from RadeonSI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29452>	2024-06-06 20:26:47 +00:00
Mike Blumenkrantz	2aaa6ebba1	build/amd: add amd-use-llvm build option This allows amd drivers to disable llvm support while still allowing llvmpipe/lavapipe to be built. By disabling llvm support in amd drivers, the load times for these drivers decrease by 5-10ms. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Mike Lothian <mike@fireburn.co.uk> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28969>	2024-05-30 19:05:00 +00:00
Marek Olšák	90b0925588	radeonsi: constify struct pipe_vertex_buffer * Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29053>	2024-05-15 06:42:34 +00:00


271 commits