fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-22 13:30:12 +01:00

Author	SHA1	Message	Date
Marek Olšák	be8977811b	ac/nir: remove shader_info parameter from ac_nir_compute_tess_wg_info Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34544>	2025-04-19 22:55:00 -04:00
Marek Olšák	d82eda72a1	ac/gpu_info: move HS info into radeon_info Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34544>	2025-04-19 22:55:00 -04:00
Marek Olšák	c057d9105f	ac/gpu_info: add total_tess_ring_size Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34544>	2025-04-19 22:54:59 -04:00
Samuel Pitoiset	bc811a602e	radeonsi: fix configuring compute scratch Missed the two different variables for graphics vs compute. Fixes: `e433a57650` ("ac,radeonsi: rework computing scratch wavesize and tmpring register") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34586>	2025-04-18 06:50:16 +00:00
Samuel Pitoiset	e433a57650	ac,radeonsi: rework computing scratch wavesize and tmpring register To be re-used by RADV. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34549>	2025-04-17 10:35:40 +00:00
Marek Olšák	7f7d6deb18	radeonsi: add ACO-specific main shader parts We can't have merged shaders where the first part is compiled using ACO and the second part is compiled using LLVM. Add ACO-specific main shader parts to fix that. This happens when ACO is enabled for gfx12 streamout where GS can be paired with a previous shader compiled by LLVM. Fixes: `8ba718fb7d` - radeonsi/gfx12: use ACO for streamout because it's faster Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34491>	2025-04-14 22:44:13 +00:00
Marek Olšák	4865ac57cc	radeonsi: make si_shader_selector::main_shader_part_* an iterable union for the next commit Fixes: `8ba718fb7d` - radeonsi/gfx12: use ACO for streamout because it's faster Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34491>	2025-04-14 22:44:13 +00:00
Marek Olšák	bafab3324e	radeonsi: reflect blitter VS in si_context::num_vertex_elements Set it to 0 if the VS doesn't use VBOs. This fixes an assertion failure. Fixes: `7bf5d2ce75` - radeonsi: add assertion requiring binding vertex elements before vertex_buffers Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12698 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33482>	2025-03-06 21:10:53 +00:00
Marek Olšák	36ccc300d8	radeonsi: enable NGG culling when the shader writes the viewport index Only W and face culling is enabled. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33482>	2025-03-06 21:10:52 +00:00
Timur Kristóf	94996d546c	nir: Don't include the full nir.h when not necessary. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33439>	2025-02-12 22:33:07 +01:00
Marek Olšák	82047fa82f	amd: drop support for LLVM 15, 16, 17 Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33211>	2025-02-01 04:22:30 +00:00
Marek Olšák	19907a24ec	radeonsi: validate BITSET_TEST_RANGE_INSIDE_WORD assertion at compile time This will prevent accidental crashes and hangs because of how we define tracked enums. The reg_enum parameter must be a compile-time constant. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32910>	2025-01-29 07:19:59 +00:00
Marek Olšák	e0d715c626	radeonsi: set gl_FragCoord to pixel center to fix GLCTS failures SPI_BARYC_CNTL is moved to the preamble because it's always 0. We set frag_coord_is_center for the NIR pass to indicate that sample_pos should be lowered differently. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32910>	2025-01-29 07:19:58 +00:00
Marek Olšák	3424cdadf5	radeonsi: fix interpolateAt* with non-GL4 ARB_sample_shading There is no test for this, but it's been broken. ARB_sample_shading doesn't set fs.uses_sample_shading in shader_info, which causes us to enter this path to force per-sample interpolation, but doing so breaks the shader if the PS prolog is used. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32910>	2025-01-29 07:19:57 +00:00
Marek Olšák	65398d571b	radeonsi: ignore pipe_rasterizer_state::force_persample_interp It just indicates that sample shading is enabled, which we were checking already. The state is redundant. Just check shader_info::fs::uses_sample_shading. ARB_sample_shading (GL3.3) doesn't set fs.uses_sample_shading in shader_info (which is for GL4.0), and that's why we have this codepath that forces per-sample interpolation. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32910>	2025-01-29 07:19:57 +00:00
Marek Olšák	1ff790a4f8	radeonsi: implement replacement of sample_mask_in with helper_invocation This just implements it in the PS prolog and LLVM IR (ACO already implements it), and enables it for monolithic shaders where it's already implemented in ac_nir_lower_ps_early. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32910>	2025-01-29 07:19:57 +00:00
Marek Olšák	e5ee15a42e	radeonsi: gather PS inputs from shader variant NIR This further reduces dependence on si_shader_info. union si_ps_input_info is added because we don't need usage_mask in there. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32910>	2025-01-29 07:19:55 +00:00
Marek Olšák	0eaff1ace8	radeonsi: set SHARED_VGPR_CNT for gfx shaders for ACO Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32910>	2025-01-29 07:19:55 +00:00
Marek Olšák	9e3033e071	radeonsi: move/rewrite PS color input gathering for shader variants This removes duplicated gathering from 3 places for shader variants, and adds it where it should be, which is before late optimizations and late lowering passes, which is where we want it for the radeonsi linker. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32910>	2025-01-29 07:19:49 +00:00
Marek Olšák	f156abd2a7	radeonsi: simplify how broadcast_last_cbuf is implemented for PS epilogs We don't need to look at the framebuffer state and record how many color buffers to write. Instead, we can deduce which color buffers are enabled from spi_shader_col_format, which already does the right thing. So PS epilogs only need a single bool flag that determines whether all enabled color buffers should be written. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32910>	2025-01-29 07:19:41 +00:00
Marek Olšák	3dcbf743c4	radeonsi: implement replacing frag_coord with pixel_coord at draw time This adds an option into the prolog key to replace frag_coord.xy with pixel_coord when sample shading is disabled, which is most of the time. This reduces the number of input VGPRs. It's already implemented in ac_nir_lower_ps_early for monolithic shaders and the PS prolog in ACO, so this just implements it for the PS prolog in LLVM IR. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32910>	2025-01-29 07:19:40 +00:00
Marek Olšák	d7d4d56f5b	ac,aco,radeonsi: replace SampleMaskIn with 1 << SampleID if full sample shading Since the sample mask is always 1 << sample_id with full sample shading, just use that instead of loading sample_mask_in. Set it to 0 if it's a helper invocation. This removes the sample mask input VGPR. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33024>	2025-01-25 12:20:25 -05:00
Marek Olšák	b1fc34f290	radeonsi: sample shading state fixes - really update sample shading state when it's changed - reduce log state bits in the shader key to 2 because we don't support 16x EQAA - exit early from si_update_ps_iter_samples if ps_iter_sample has the same value since the last call - set missing wqm for the PS prolog (this might fix tests) Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33024>	2025-01-25 12:20:25 -05:00
Marek Olšák	03ad2bc782	radeonsi: make many shader functions static or move them to .c files - many non-inline functions are only used in 1 .c file: make them static - some inline functions are only used in 1 .c file: move them there Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33046>	2025-01-16 02:58:03 +00:00
Marek Olšák	b05fa7d575	radeonsi/gfx12: set DIS_PG_SIZE_ADJUST_FOR_STRIP after shader compilation Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32957>	2025-01-09 23:37:55 -05:00
Marek Olšák	1b405a12e0	radeonsi: only set BREAK_PRIMGRP/WAVE_AT_EOI when TES/GS need PrimID sysval after TES Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32957>	2025-01-09 23:37:52 -05:00
Marek Olšák	f06a103eea	radeonsi: don't set BREAK_PRIMGRP/WAVE_AT_EOI when tessellation is disabled It's not required and it decreases performance. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32957>	2025-01-09 22:44:44 -05:00
Georg Lehmann	aee0c7274c	amd: switch to FRONT_FACE_ALL_BITS(0) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32791>	2024-12-30 22:31:34 +00:00
Marek Olšák	9b7ea720c9	radeonsi: use nir->info instead of sel->info.base sel->info is out of date after shader variant optimizations. We need to stop using it. Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>	2024-12-24 12:02:20 +00:00
Marek Olšák	7ddb28f447	radeonsi: remove some uses of enum pipe_shader_type it's identical to gl_shader_stage Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>	2024-12-24 12:02:20 +00:00
Marek Olšák	6a1bdf2f78	radeonsi/gfx12: tune streamout performance Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>	2024-12-24 12:02:20 +00:00
Marek Olšák	8440184dfd	radeonsi: make NGG streamout output primitive type known at compile time This compiles an optimized shader variant for NGG streamout where the output primitive is known at compile time. This allows putting stores for all vertices into the same VMEM clause. Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>	2024-12-24 12:02:20 +00:00
Marek Olšák	5003465c42	radeonsi: eliminate shader code computing killed Z/S/samplemask PS outputs Compile a monolithic optimized shader to do that, and clean up the comments. Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>	2024-12-24 12:02:20 +00:00
Marek Olšák	58132d6fc8	radeonsi: implement nir_opt_frag_depth using kill_z instead of the NIR pass This uses si_shader_info to store whether gl_FragDepth can be removed, and it uses the kill_z epilog flag to do the removal without recompilation. Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>	2024-12-24 12:02:20 +00:00
Marek Olšák	b56f47611a	radeonsi: fix alpha-to-coverage + alpha-to-one used together for gfx6-10.3 It works exactly like gfx11 except that COVERAGE_TO_MASK_ENABLE must be 1 to indicate that alpha for alpha-to-coverage should be read from mrtz.a. Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>	2024-12-24 12:02:20 +00:00
Marek Olšák	08abddd235	radeonsi/gfx11: fix alpha-to-coverage + alpha-to-one used together alpha-to-coverage must be applied before alpha-to-one. The only way to do that is to export alpha for alpha-to-coverage via mrtz, and export 1 via mrt0.a. ACO and monolithic shader support is already in place thanks to RADV, so we only need to change the LLVM PS epilog and the shader key. Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>	2024-12-24 12:02:20 +00:00
Marek Olšák	de996ac481	radeonsi: kill Z and stencil PS outputs if depth or stencil is disabled This adds kill_z and kill_stencil flags to the shader PS epilog key, which removes those outputs if depth or stencil are disabled. It must be implemented in: * ACO PS epilog * LLVM PS epilog * ac_nir_lower_ps for monolithic shaders Some of the samplemask code wasn't completely correct, but probably harmless. Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>	2024-12-24 12:02:20 +00:00
Marek Olšák	823e9e846e	radeonsi: switch to the new TCS LDS/offchip size computation The new TCS LDS size should be less than what it was before. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31673>	2024-12-18 11:07:59 +00:00
Marek Olšák	85c20def94	ac,radv,radeonsi: enable TCS input reads from VGPRs for all compatible loads Cross-invocation TCS input access doesn't prevent same-invocation access. This improves shaders that use both for the same inputs. Also, if some components of a vec4 slot only use same-invocation access and other components only use cross-invocation access (it's possible after compaction), this takes the VGPR path for the components with same-invocation access, which didn't happen previously because all masks only describe whole vec4s. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31673>	2024-12-18 11:07:59 +00:00
Alyssa Rosenzweig	41076b2a55	radeonsi: use mesa_prim_has_adjacency Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>	2024-12-12 21:16:13 +00:00
Marek Olšák	680f7afe0b	radeonsi: don't use nir_io_dont_optimize because it's deprecated There is a new environment variable that can be used instead. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32257>	2024-11-20 21:08:30 +00:00
Marek Olšák	06292538ae	radeonsi: add helper si_shader_culling_enabled it will contain more logic Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32257>	2024-11-20 21:08:29 +00:00
Marek Olšák	d7415d3717	radeonsi: clean up and rename gfx10_edgeflags_have_effect Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32257>	2024-11-20 21:08:29 +00:00
Marek Olšák	6988967a1f	radeonsi: rewrite/replace gfx10_ngg_get_vertices_per_prim Reuse si_get_input_prim (which is similar) and split it into 2 functions: - si_get_output_prim_simplified - si_get_num_vertices_per_output_prim Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32257>	2024-11-20 21:08:29 +00:00
Marek Olšák	963a84677e	radeonsi: optionally return MESA_PRIM_UNKNOWN from si_get_input_prim it will be used later Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32257>	2024-11-20 21:08:29 +00:00
Marek Olšák	691a9ccb33	radeonsi: prepare for making SI_NGG_CULL_TRIANGLES/LINES VS only, rename them They will have no effect on TES and GS, so this will make it more obvious. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32257>	2024-11-20 21:08:29 +00:00
Marek Olšák	51aa1d8381	radeonsi: fix gl_FrontFace elimination when one side is culled Fixes: `55d81214c9` - radeonsi: replace gl_FrontFacing with a constant if one side is always culled Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32186>	2024-11-19 11:43:20 +00:00
Marek Olšák	5be9d76861	radeonsi: fix an assertion failure in si_shader_ps with AMD_DEBUG=mono assert(!shader->key.ps.part.prolog.force_persp_center_interp || (!G_0286CC_PERSP_SAMPLE_ENA(input_ena) && !G_0286CC_PERSP_CENTROID_ENA(input_ena))); failed when all FS inputs have been eliminated by optimizations, which causes LLVM to set PERSP_SAMPLE_ENA because at least 1 of those must be enabled, which this code didn't expect. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32186>	2024-11-19 11:43:20 +00:00
Marek Olšák	8deb32ac2e	radeonsi: split outputs_written_before_tes_gs into ls_es_* and tcs_* masks these will have different values later Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32171>	2024-11-16 21:59:29 -05:00
Marek Olšák	40d9616bd3	radeonsi: don't pad esgs_vertex_stride if it's 0 so that we don't allocate any LDS for ES->GS varyings if it's unused. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31291>	2024-09-20 19:49:44 +00:00

290 commits