fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-21 15:28:18 +02:00

Author	SHA1	Message	Date
Qiang Yu	5f601361ed	ac/nir: lower access for shared and scratch memory OpenCL may load and store vec16 data, while ACO only supports <= 32 bytes. Radeonsi is going to use ac_nir_lower_mem_access_bit_sizes() for lowering these accesses. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32781>	2024-12-27 01:58:38 +00:00
Marek Olšák	c0e5e8f932	amd: update addrlib Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32687>	2024-12-26 21:02:21 +00:00
Marek Olšák	c6fd69bd5e	ac: remove unused code Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32780>	2024-12-26 10:12:43 +00:00
Marek Olšák	de996ac481	radeonsi: kill Z and stencil PS outputs if depth or stencil is disabled This adds kill_z and kill_stencil flags to the shader PS epilog key, which removes those outputs if depth or stencil are disabled. It must be implemented in the ACO PS epilog, the LLVM PS epilog, and ac_nir_lower_ps for monolithic shaders. Some of the samplemask code wasn't completely correct, but probably harmless. Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>	2024-12-24 12:02:20 +00:00
Daniel Schürmann	28a214728c	ac/lower_ngg: move readlane into break blocks in streamout code generation for gfx12/ACO This avoids unnecessary shuffle code and s_wait_loadcnt. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32743>	2024-12-21 12:32:25 +00:00
Daniel Schürmann	47227089d6	ac/lower_ngg: move break blocks after loop in streamout code generation for gfx12/ACO By inverting the break condition, the loop becomes shorter. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32743>	2024-12-21 12:32:25 +00:00
Daniel Schürmann	39dcd9dedb	ac/lower_ngg: Fix collecting buffer offsets from 4 lanes on gfx12 Also use readlane for improved performance. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32743>	2024-12-21 12:32:25 +00:00
Marek Olšák	4d8a508510	ac/nir: call nir_gather_tcs_info only once for RADV Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31673>	2024-12-18 11:07:59 +00:00
Marek Olšák	8c2f9f0665	radv: switch to the new TCS LDS/offchip size computation to use the same logic as radeonsi. This could be improved, see TODOs. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31673>	2024-12-18 11:07:59 +00:00
Marek Olšák	3056bf1cb1	ac/nir: add new helpers for computing the TCS LDS/offchip size accurately This is based on how the HS lowering passes address TCS inputs and outputs. The new LDS size is lower in some cases. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31673>	2024-12-18 11:07:59 +00:00
Marek Olšák	85c20def94	ac,radv,radeonsi: enable TCS input reads from VGPRs for all compatible loads Cross-invocation TCS input access doesn't prevent same-invocation access. This improves shaders that use both for the same inputs. Also, if some components of a vec4 slot only use same-invocation access and other components only use cross-invocation access (it's possible after compaction), this takes the VGPR path for the components with same-invocation access, which didn't happen previously because all masks only describe whole vec4s. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31673>	2024-12-18 11:07:59 +00:00
Marek Olšák	99a03dc9d5	ac/nir: allow a TCS input to be available from both VGPRs and LDS Both can be used. Cross-invocation access can read it from LDS, while same-invocation access can read it from VGPRs. The entrypoints of the passes don't allow that flexibility yet, but the logic inside the pass allows it. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31673>	2024-12-18 11:07:59 +00:00
Marek Olšák	b49eab68a8	ac/nir: use s_sendmsg(HS_TESSFACTOR) to optimize writing tess factors for gfx11 This uses the new shader message. It eliminates memory stores and latency for simple cases of tess level values. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31673>	2024-12-18 11:07:59 +00:00
Marek Olšák	f4eebb373c	ac/nir: reserve the first LDS vec4 for the HS tf0/1 group vote in TCS Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31673>	2024-12-18 11:07:59 +00:00
Marek Olšák	cdecbee922	radeonsi/gfx12: adjust HiZ/HiS logic Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32653>	2024-12-16 21:54:28 +00:00
Marek Olšák	e3cef02c24	radeonsi/gfx12: set DB_RENDER_OVERRIDE based on stencil state Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32653>	2024-12-16 21:54:28 +00:00
Marek Olšák	8328e57512	ac/surface/gfx12: enable DCC 256B compressed blocks and reorder modifiers Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32653>	2024-12-16 21:54:27 +00:00
Marek Olšák	e6345e2fd3	ac: update SPI_GRP_LAUNCH_GUARANTEE_* register values for gfx12 Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32653>	2024-12-16 21:54:27 +00:00
Marek Olšák	3943ed8199	ac/lower_ngg: improve streamout code generation for gfx12/ACO to match LLVM ACO is still not perfect: * It generates s_wait_loadcnt 0x0-0x3 when the only required wait instruction is s_wait_loadcnt 0x5. * It generates a lot of unnecessary jumps and blocks for uniform loop breaks. Only scc1 jumps are necessary to break the loop. This is 10x better than LLVM, but even ACO might consider using nir_intrinsic_ordered_add_loop_gfx12_amd for the best performance. How to print the streamout asm on any GPU: PIGLIT_PLATFORM=gbm AMD_FORCE_FAMILY=gfx12_16pipe AMD_DEBUG=vs,mono,asm,useaco ../piglit/bin/shader-io-rate vs_out_xfb Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32570>	2024-12-16 07:35:07 +00:00
Qiang Yu	b14cc34415	ac/surf: add more modifiers to gfx12 supported list OpenGL will export these modifiers for various sized textures. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32570>	2024-12-16 07:35:06 +00:00
Qiang Yu	b3a218d444	ac/surface/tests: support all block sizes We are going to add more modifiers. GFX9 has 4K DCC and non-DCC modifiers while others only have 4K non-DCC modifiers. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32570>	2024-12-16 07:35:06 +00:00
Timur Kristóf	9224b9a752	ac/nir/ngg: Add ability to store primitive ID as per-primitive. This configuration will be enabled in RADV in a subsequent commit. On GFX10.3: Do this together with the primitive export, to avoid adding extra CF, and to ensure optimal access of the export space. On GFX11: It's not an export but a memory store instruction, so always do it earlier and ensure the optimal attribute ring access pattern. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32270>	2024-12-12 18:11:45 +00:00
Samuel Pitoiset	c3a050da07	radv: fix alpha-to-coverage with alpha-to-one without MRTZ This injects a MRTZ export with only the alpha channel to select it with COVERAGE_TO_MASK_ENABLE for alpha-to-coverage. Co-Authored-by: Rhys Perry <pendingchaos02@gmail.com> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32583>	2024-12-12 10:07:25 +00:00
Tim Huang	ad75b9f1a6	amd: add GFX v11.5.3 support This enables support for GFX version 11.5.3. Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32567>	2024-12-11 19:14:34 +00:00
Samuel Pitoiset	1037830098	ac/nir: export alpha to MRTZ.a and one to MRT0.a for alpha-to-one on GFX11 When alpha-to-coverage and alpha-to-one are both enabled in the fragment shader, the alpha value should be exported through MRTZ and one to MRT0.a. Otherwise, alpha-to-one will be performed before alpha-to-coverage. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32523>	2024-12-11 10:50:31 +00:00
Rhys Perry	033e76a82a	ac/nir: have ac_nir_lower_mem_access_bit_sizes preserve >128 bit SMEM Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32408>	2024-12-09 16:56:29 +00:00
Marek Olšák	d8468d5463	amd,zink: remove options.varying_estimate_instr_cost callbacks They are a maintenance burden since they would need changes to support more instruction types that nir_opt_varyings will be able to move between shaders, and they are almost identical to default_varying_estimate_instr_cost, so just use that. The cost threshold is adjusted for AMD because default_varying_estimate_instr_cost is slightly different. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32424>	2024-12-04 13:40:41 +00:00
Samuel Pitoiset	9df3c9e4a1	ac/parse_ib: print VA for the SDMA CONSTANT_FILL/WRITE packets Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32456>	2024-12-03 15:29:40 +00:00
Samuel Pitoiset	31524d42a2	ac/parse_ib: fix parsing SDMA CONSTANT_FILL packet This packet only has 5 DWORDS. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32456>	2024-12-03 15:29:39 +00:00
Yogesh Mohan Marimuthu	f930201898	ac/gpu_info: populate fw info using new fw info ioctl for userq Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	8447cb563f	winsys/amdgpu: send hdp flush packet for userq Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	94c41852bd	ac: add inherit vmid field to indirect buffer packet Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	cda75d6497	ac: add new userq signal and wait packet id Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	093cf74b26	ac/gpuinfo: add use_userq and AMD_USERQ variable Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Shashank Sharma	42d49faee5	amd: add new AMDGPU_INFO subquery for userqueue metadata This patch: - adds a new subquery (AMDGPU_INFO_UQ_FW_AREAS) in AMDGPU_INFO_IOCTL to get the size and alignment of shadow and csa objects from the kernel. This information is required for a userqueue consumer (like MESA/libdrm) to create the userqueue metadata objects properly. - also adds supporting metadata structures and a high level wrapper function (amdgpu_query_uq_metadata_info) to the query, to make it easy to use. The corresponding kernel changes for this UAPI extension can be found in amd-gfx mailing list, link: https://patchwork.freedesktop.org/patch/621390/?series=139715&rev=2 This patch adds support only for the GFX IP, and the other engines may be supported in subsequent development. This patch was reviewed in libdrm library at https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/400 Cc: Marek Olsak <marek.olsak@amd.com> Cc: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Arvind Yadav <arvind.yadav@amd.com> Reviewed-by: Marek Olsak <marek.olsak@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Arvind Yadav	b0a70da496	amd: Add amdgpu userqueue IOCTL functions This patch adds new IOCTL functions to support userqueue create, remove, signal and wait etc. This patch was reviewed in libdrm library at https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/392 Cc: Deucher, Alexander <alexander.deucher@amd.com> Cc: Koenig, Christian <christian.koenig@amd.com> Cc: Sharma, Shashank <shashank.sharma@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	3981b017eb	amd: include amdgpu_drm.h from mesa instead of system for ac_fake_hw_db.h Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
David Rosca	308bae950f	ac/surface: Add RADEON_SURF_VIDEO_REFERENCE Select supported swizzle mode for VCN DPB surfaces. Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32303>	2024-12-02 13:48:22 +00:00
Benjamin Cheng	323b59a5b5	radv/video: support event for pre-VCN4 decode queues Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32400>	2024-11-29 10:03:48 +10:00
Benjamin Cheng	152b06acd8	ac/vcn: allow sq signature package to be skipped This is preparing for radv event support on pre-VCN4 encode queues. Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32400>	2024-11-29 10:01:49 +10:00
David Rosca	489ba819b0	radeonsi/vcn: Support tiling for JPEG decode Reviewed-by: Leo Liu <leo.liu@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32301>	2024-11-28 08:52:37 +00:00
Pierre-Eric Pelloux-Prayer	272addc672	ac/nir: remove prim_stride_ret arg from ngg_build_streamout_buffer_info This is not used outside of this function, so declare it as a local variable instead. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32281>	2024-11-27 19:00:20 +00:00
Georg Lehmann	239c0124df	radv: optimize sample mask comparisons Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32327>	2024-11-26 18:44:39 +00:00
Marek Olšák	a3516dafc9	util,amd: add inlinable versions of drmIoctl/drmCommandWrite* The reason for this is to inline those calls in drivers. They are very trivial, so why not. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32067>	2024-11-26 00:16:02 -05:00
Marek Olšák	049641ca54	amd: import libdrm_amdgpu ioctl wrappers This imports 35 libdrm_amdgpu functions into Mesa. The following 15 functions are still in use: amdgpu_bo_alloc, amdgpu_bo_cpu_map, amdgpu_bo_cpu_unmap, amdgpu_bo_export, amdgpu_bo_free, amdgpu_bo_import, amdgpu_create_bo_from_user_mem, amdgpu_device_deinitialize, amdgpu_device_get_fd, amdgpu_device_initialize, amdgpu_get_marketing_name, amdgpu_query_sw_info, amdgpu_va_get_start_addr, amdgpu_va_range_alloc, amdgpu_va_range_free. We can't import them because they make sure that we only use 1 VMID per process shared by all APIs (except the marketing name). Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32067>	2024-11-25 21:03:41 -05:00
Timur Kristóf	8653abac09	ac/nir/ngg: Remove erroneous NUW addition from workgroup scan. This may add constant -1 so naturally it can indeed cause an unsigned wrap. Fixes: `492d8f3778` Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12204 Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32338>	2024-11-25 21:43:45 +00:00
Timur Kristóf	45c523104a	ac/nir/ngg: Implement optional primitive compaction. It's an experimental feature that we may enable later. Instead of exporting NULL primitives, perform a compaction on primitives after culling. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32290>	2024-11-25 01:56:20 +01:00
Timur Kristóf	492d8f3778	ac/nir/ngg: Workgroup scan over two bools. Implement two workgroup scans over two boolean values in parallel, so that they can be done with very minimal ALU overhead. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32290>	2024-11-25 01:56:08 +01:00
Timur Kristóf	78f77e161c	ac/nir/ngg: Pass wg_repack_result as pointer instead of returning it. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32290>	2024-11-25 01:55:30 +01:00
Timur Kristóf	73fc29b25c	ac/nir/ngg: Slightly refactor workgroup scan. No functional changes, just makes the code more readable. Use inverse_ballot instead of elect. Wrap if contents, rename if. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31973>	2024-11-22 01:01:39 +01:00

2893 commits