fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-06 07:18:17 +02:00

Author	SHA1	Message	Date
Sagar Ghuge	b2cffdb1ed	anv: Implement write_buffer_cp callback Rework: (Kevin) - Fix pointer arithmetic calculation. - Add assertion for a 4B alignment Co-authored-by: Kevin Chuang <kaiwenjon23@gmail.com> Co-authored-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>	2024-12-04 10:41:45 +00:00
Sagar Ghuge	8817ff26fc	anv: Move update buffer code in helper Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>	2024-12-04 10:41:45 +00:00
Sagar Ghuge	0edf208ab9	anv: Implement cmd_fill_buffer_addr callback Co-authored-by: Kevin Chuang <kaiwenjon23@gmail.com> Co-authored-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>	2024-12-04 10:41:45 +00:00
Kevin Chuang	2fe57947e3	anv: Implement encode shader to fit in ANV BVH This shader gets called and will construct ANV BVH from IR BVH. More specifically, each invocation will take care of one internal node. The internal nodes get processed starting from root node all the way to the bottom leaves. During processing, we keep track of the destination of where the internal node should be encoded (tracked in vk_ir_box.bvh_offset), and where its leaves should be encoded (tracked in vk_ir_header.dst_node_offset). The processed bvh is in contiguous memory, which starts with header, followed by interleaving internal nodes and leaves. The nodes information are also populated. Rework: (Sagar) - Return out of bounds threads early - Mimic GRL internal node encoding - Handle node mask - Fix block_incr_and_start_prim - Fix shader_index_and_geom_mask for instance node - Fix instance flag - Fix block_incr and instance_contribution_and_geom_flags initialized to be zero - Fix lower_x and upper_x to be properly flipped for invalid child - For invalid node, clear blockIncr and set startPrim to INVALID - Calculated things upfront and assign, cutting down more than ~200 instructions Co-authored-by: Kevin Chuang <kaiwenjon23@gmail.com> Co-authored-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>	2024-12-04 10:41:45 +00:00
Sagar Ghuge	692b5fa9f2	anv: Add shader to copy acceleration structures Rework (Kevin) - encode the address of anv_instance_leaf after header in order to handle serialization and deserialization part. - draw serialized data layout and explanation Co-authored-by: Kevin Chuang <kaiwenjon23@gmail.com> Co-authored-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>	2024-12-04 10:41:45 +00:00
Sagar Ghuge	a6b1a1fce1	anv: Add shader to build BVH header Rework: (Kevin) - Calculate the compacted_size properly - Update instance count and self pointer - The alignment of serialization size is not needed Co-authored-by: Kevin Chuang <kaiwenjon23@gmail.com> Co-authored-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>	2024-12-04 10:41:45 +00:00
Sagar Ghuge	ef94b7097b	anv: Add header to track BVH data structures This commit adds build interface and helper header for ANV BVH. Rework: (Kevin) - Use block_size macro to represent bvh node/leaf size - Rename BVH-related node/leaf size macros for clarity - Updated comments for some fields for bvh and nodes. - move bvh_layout to anv_bvh.h - Draw anv_bvh layout - rename child_offset to child_block_offset Co-authored-by: Kevin Chuang <kaiwenjon23@gmail.com> Co-authored-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>	2024-12-04 10:41:45 +00:00
Sagar Ghuge	617b7602ea	anv: Split GRL code path in separate file Rework (Kevin) - Remove genX_acceleration_structure.c from meson option to avoid linking error Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>	2024-12-04 10:41:44 +00:00
Sagar Ghuge	b002b2589c	anv: Update include dir for anv_tests Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>	2024-12-04 10:41:44 +00:00
Samuel Pitoiset	5d072e0e73	radv: fix stencil only copies of depth/stencil images with SDMA This was broken for two reasons: - the number of bytes per element should be 1 (8-bit for stencil) - the base offset should be adjusted for the stencil Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32459>	2024-12-04 09:30:36 +00:00
Lionel Landwerlin	69edf4144a	brw: use transpose unspill messages when possible This simplifies the unspill messages quite a bit. A/B testing on DG2 : BlackOps3 : +0.96% TotalWarPharaoh: +0.31% DG2 shader changes : Assassin's Creed Valhalla: Totals from 19 (0.89% of 2131) affected shaders: Instrs: 70542 -> 64369 (-8.75%) Cycle count: 18810945 -> 18560169 (-1.33%); split: -1.40%, +0.06% Black Ops 3: Totals from 55 (3.41% of 1612) affected shaders: Instrs: 389549 -> 350646 (-9.99%) Cycle count: 344168275 -> 340652311 (-1.02%); split: -1.17%, +0.15% Control: Totals from 1 (0.11% of 878) affected shaders: Instrs: 3409 -> 3212 (-5.78%) Cycle count: 255991 -> 250411 (-2.18%) Cyberpunk 2077: Totals from 1 (0.08% of 1264) affected shaders: Instrs: 2363 -> 2337 (-1.10%) Cycle count: 69283 -> 69186 (-0.14%) Fallout 4: Totals from 1 (0.06% of 1601) affected shaders: Instrs: 27946 -> 20056 (-28.23%) Cycle count: 2391398 -> 2153658 (-9.94%) Fortnite: Totals from 273 (3.65% of 7470) affected shaders: Instrs: 634377 -> 601519 (-5.18%) Cycle count: 31870433 -> 31624089 (-0.77%); split: -0.78%, +0.01% Hogwarts Legacy: Totals from 50 (3.02% of 1656) affected shaders: Instrs: 110455 -> 103339 (-6.44%) Cycle count: 6613728 -> 6530832 (-1.25%); split: -1.28%, +0.03% Metro Exodus: Totals from 70 (0.16% of 43076) affected shaders: Instrs: 253847 -> 245321 (-3.36%) Cycle count: 13269473 -> 13209131 (-0.45%) Spill count: 1111 -> 1108 (-0.27%) Fill count: 2868 -> 2865 (-0.10%) Red Dead Redemption 2: Totals from 139 (2.38% of 5847) affected shaders: Instrs: 496551 -> 450180 (-9.34%) Cycle count: 43233944 -> 40947386 (-5.29%); split: -5.33%, +0.04% Spill count: 6322 -> 6326 (+0.06%) Fill count: 15558 -> 15568 (+0.06%) Rise Of The Tomb Raider: Totals from 1 (0.56% of 178) affected shaders: Instrs: 1682 -> 1437 (-14.57%) Cycle count: 603670 -> 586766 (-2.80%) Spiderman Remastered: Totals from 820 (11.77% of 6965) affected shaders: Instrs: 4622877 -> 3984893 (-13.80%) Cycle count: 235094963186 -> 234483925430 (-0.26%); split: -0.42%, +0.16% Spill count: 73414 -> 73581 (+0.23%); split: -0.02%, +0.25% Fill count: 215090 -> 215627 (+0.25%); split: -0.02%, +0.27% Scratch Memory Size: 3520512 -> 3528704 (+0.23%); split: -0.12%, +0.35% Some of stats show spilling changes which is telling of how our spill code is not adequate. Some of the spilled values are probably being respilled which shouldn't be the case. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32110>	2024-12-04 08:59:07 +00:00
Pavel Ondračka	dcfa8851bd	ci: bring back some i915g testing Only single g33 as part of r300 ci-tron-based farm. Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com> Reviewed-by: Eric Engestrom <eric@igalia.com> Reviewed-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32376>	2024-12-04 08:18:43 +00:00
Kenneth Graunke	2ade3ec2a9	brw: Allow SIMD32 math instructions on Xe2 There's no restriction here AFAICT - only when HF types are involved. fossil-db results on Lunar Lake: Totals: Instrs: 143665291 -> 142654109 (-0.70%) Cycle count: 22516049016 -> 22514172014 (-0.01%); split: -0.02%, +0.01% Max live registers: 49038116 -> 49017687 (-0.04%); split: -0.04%, +0.00% Totals from 117623 (21.07% of 558370) affected shaders: Instrs: 25098642 -> 24087460 (-4.03%) Cycle count: 1038884570 -> 1037007568 (-0.18%); split: -0.48%, +0.29% Max live registers: 12423219 -> 12402790 (-0.16%); split: -0.16%, +0.00% Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32471>	2024-12-04 02:42:34 +00:00
Kenneth Graunke	815236b417	brw: Fix register unit calculation in SIMD32 LOAD_PAYLOAD lowering We were wanting to check if the destination region spanned multiple registers. But we were checking against REG_SIZE, when the register size is actually REG_SIZE * reg_unit(devinfo) now. This meant that SIMD32 LOAD_PAYLOAD was always getting SIMD-split on Xe2 platforms, generating a lot of unnecessary mess for compute shaders. fossil-db results on Lunar Lake: Totals: Instrs: 146178614 -> 143291988 (-1.97%); split: -1.98%, +0.00% Subgroup size: 11089632 -> 11089376 (-0.00%); split: +0.00%, -0.00% Cycle count: 22528892444 -> 22507551650 (-0.09%); split: -0.12%, +0.03% Max live registers: 48834202 -> 48886685 (+0.11%); split: -0.09%, +0.20% Totals from 134306 (24.10% of 557327) affected shaders: Instrs: 28806335 -> 25919709 (-10.02%); split: -10.02%, +0.00% Subgroup size: 4297680 -> 4297424 (-0.01%); split: +0.00%, -0.01% Cycle count: 956867650 -> 935526856 (-2.23%); split: -2.84%, +0.61% Max live registers: 13085711 -> 13138194 (+0.40%); split: -0.33%, +0.73% Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32471>	2024-12-04 02:42:34 +00:00
Caio Oliveira	dfa4c55a4f	intel/brw: Add is_control_source for the new subgroup ops Fixes: `019770f026` ("intel/brw: Add SHADER_OPCODE_VOTE_") Fixes: `9537b62759` ("intel/brw: Add SHADER_OPCODE_REDUCE") Fixes: `0ba1159b0a` ("intel/brw: Add SHADER_OPCODE__SCAN") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32411>	2024-12-04 01:19:37 +00:00
Nanley Chery	428a970511	anv: Only consider R32 image formats as supporting atomics Only consider R32 image formats as supporting atomics because we only expose VK_FORMAT_FEATURE_2_STORAGE_IMAGE_ATOMIC_BIT for those formats. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32192>	2024-12-03 22:54:35 +00:00
Nanley Chery	122c01a496	anv: Enable more storage compression on gfx12+ On gfx12.0, allow storage compression unless the image may be used with atomics. On gfx20, use the CCS_E aux-usage for storage compression. This causes ISL to create surface states with more appropriate render compression formats. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5657 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32192>	2024-12-03 22:54:35 +00:00
Nanley Chery	01c4ea771c	anv: Enable storage accesses with modifiers on gfx12+ I tested this patch with an ACM card. It enables "Halo: The Master Chief Collection" to use the clear color modifier instead of falling back to the uncompressed Tile4 modifier. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32192>	2024-12-03 22:54:35 +00:00
Nanley Chery	2dedd8dbb2	intel/isl: Fix DecompressInL3 assignment on gfx12.5 * In the ACM PRMs, the programming notes under RENDER_SURFACE_STATE::MemoryCompressionEnable state that the DecompressInL3 bit must be set for media compression. * Unlike TGL, ACM seems to handle format reinterpretation just fine without using the bit. Update the assignment accordingly. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32192>	2024-12-03 22:54:34 +00:00
Autumn Ashton	7e9ea5c1b5	radv/video: Fix bitstreamStartOffset including dstBufferOffset The bitstreamStartOffset from the VK_QUERY_TYPE_VIDEO_ENCODE_FEEDBACK_KHR query in RADV is currently relative to the dstBuffer, and not dstBuffer + dstBufferOffset like the spec states. To fix this, let's append the offset to the VA directly and not tell the encoder about the offset relative to the VA at all. The Vulkan spec states: "VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_BUFFER_OFFSET_BIT_KHR specifies that queries managed by the pool will capture the byte offset of the bitstream data written by the video encode operation to the bitstream buffer specified in VkVideoEncodeInfoKHR::dstBuffer relative to the offset specified in VkVideoEncodeInfoKHR::dstBufferOffset." The relevant part being that it is relative to dstBufferOffset and not the start of the VkBuffer. Signed-off-by: Autumn Ashton <misyl@froggi.es> Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32452>	2024-12-03 22:19:43 +00:00
Georg Lehmann	1a7ebfd2a8	radv: rework vk_property initialization Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32457>	2024-12-03 21:43:34 +00:00
Georg Lehmann	b961537a17	radv: fix reporting mesh/task/rt as supported dgc indirect stages Fixes: `8300378bf3` ("radv: advertise VK_EXT_device_generated_commands on GFX8+") Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32457>	2024-12-03 21:43:34 +00:00
Gurchetan Singh	03b527ea92	gfxstream: fix issues with VK1.4 build Fixes build after VK1.4 update. Reviewed-by: Marcin Radomski <dextero@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32446>	2024-12-03 20:35:44 +00:00
Gurchetan Singh	ade6a19f14	gfxstream: remove abort() I have no idea why it just started complaining now about this. Reviewed-by: Marcin Radomski <dextero@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32446>	2024-12-03 20:35:44 +00:00
Erik Faye-Lund	7b8f5b0881	panvk: report minmax-support for sampled formats We also need to report minmax as part of the format-features. This fixes the following CTS tests for me: - dEQP-VK.api.info.format_properties.r8_unorm - dEQP-VK.api.info.format_properties.r8_snorm - dEQP-VK.api.info.format_properties.r16_sfloat - dEQP-VK.api.info.format_properties.r32_sfloat - dEQP-VK.api.info.format_properties.d16_unorm Fixes: `1fc454673a` ("panvk: Implement VK_EXT_sampler_filter_minmax for v10") Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32445>	2024-12-03 20:16:58 +00:00
Aleksi Sapon	0812a8bccc	draw: front-face injection must check geometry shader primitive type Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32306>	2024-12-03 19:42:22 +00:00
Mary Guillemard	cdf822632a	panvk: Add a nightly job for Mali-G52 We have quite a big fraction currently and it has been proven that we are missing new failures right now. This adds a new nightly job that runs a full CTS on a single VIM3. Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Acked-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32421>	2024-12-03 19:11:30 +00:00
Mary Guillemard	913a7b26e1	panvk: Update Mali-G52 CI baseline We seem to have new regressions that were introduced but never seen because of the massive fraction used. This adds the failures seen with a full run while trying to document some. Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Acked-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32421>	2024-12-03 19:11:30 +00:00
Pavel Ondračka	61d890b6db	r300/ci: update RV410 CI expectations The test was almost passing before but we can't really always get the required five decimal point tolerance with the R300/R400 hw. nir_opt_algebraic improvements in `92797c6878` shuffled the ALUs a bit and we now do a bit better. Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32458>	2024-12-03 18:48:32 +00:00
Konstantin Seurer	4ed867825a	lavapipe: Implement VK_KHR_shader_float_controls2 Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31273>	2024-12-03 16:15:25 +00:00
Konstantin Seurer	540e84bedb	gallivm: Preserve -0 and nan Some operations need additional or different code to preserve the sign of 0 or nan. Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31273>	2024-12-03 16:15:25 +00:00
Konstantin Seurer	f5db70cb24	gallivm: Add float operation behavior flags to lp_type Used to emit additional code if -0 or nan needs to be preserved. Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31273>	2024-12-03 16:15:25 +00:00
Samuel Pitoiset	9df3c9e4a1	ac/parse_ib: print VA for the SDMA CONSTANT_FILL/WRITE packets Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32456>	2024-12-03 15:29:40 +00:00
Samuel Pitoiset	31524d42a2	ac/parse_ib: fix parsing SDMA CONSTANT_FILL packet This packet only has 5 DWORDS. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32456>	2024-12-03 15:29:39 +00:00
Georg Lehmann	34a47e4b14	nir/opt_algebraic: mark a - ffract(a) as nan incorrect. Inf + fract(Inf) -> Inf + NaN -> NaN floor(Inf) -> Inf Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32393>	2024-12-03 14:42:18 +00:00
Georg Lehmann	2ee96cf514	nir/opt_algebraic: optimize d3d9 ceil No Foz-DB changes. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32393>	2024-12-03 14:42:18 +00:00
Georg Lehmann	34caed8adb	nir/opt_algebraic: optimize d3d9 ftrunc Foz-DB Navi21: Totals from 85 (0.11% of 79395) affected shaders: MaxWaves: 1972 -> 1968 (-0.20%) Instrs: 48682 -> 47067 (-3.32%) CodeSize: 255664 -> 247172 (-3.32%) VGPRs: 3752 -> 3768 (+0.43%) Latency: 154414 -> 150360 (-2.63%) InvThroughput: 37186 -> 35081 (-5.66%) VClause: 847 -> 865 (+2.13%); split: -0.24%, +2.36% SClause: 768 -> 796 (+3.65%) Copies: 2763 -> 2869 (+3.84%); split: -0.14%, +3.98% VALU: 28133 -> 26781 (-4.81%) SALU: 7182 -> 6939 (-3.38%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32393>	2024-12-03 14:42:18 +00:00
Georg Lehmann	ea4aa8e5a6	nir/opt_algebraic: optimize ffma(b2f, b2f, c) Foz-DB Navi21: Totals from 134 (0.17% of 79395) affected shaders: Instrs: 153297 -> 153326 (+0.02%); split: -0.03%, +0.05% CodeSize: 829520 -> 828444 (-0.13%); split: -0.13%, +0.00% Latency: 900489 -> 899964 (-0.06%); split: -0.07%, +0.01% InvThroughput: 267838 -> 267478 (-0.13%); split: -0.14%, +0.00% VClause: 2452 -> 2454 (+0.08%) Copies: 8331 -> 8353 (+0.26%); split: -0.25%, +0.52% PreSGPRs: 4974 -> 4964 (-0.20%) PreVGPRs: 6209 -> 6218 (+0.14%) VALU: 112317 -> 112092 (-0.20%); split: -0.21%, +0.01% SALU: 12451 -> 12694 (+1.95%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32393>	2024-12-03 14:42:18 +00:00
Marek Olšák	1f69258fb4	st/mesa: replace EmitNoIndirectInput / EmitNoIndirectOutput with NIR options Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32423>	2024-12-03 12:57:36 +00:00
Marek Olšák	7f4e36ff7d	gallium: replace PIPE_SHADER_CAP_INDIRECT_INPUT/OUTPUT_ADDR with NIR options This is a prerequisite for enabling nir_opt_varyings for all gallium drivers. nir_lower_io_passes (called by the GLSL linker) only uses NIR options to lower indirect IO access before lowering IO and calling nir_opt_varyings. Most drivers report full support for indirect IO and lower it themselves, which prevents compaction of lowered indirectly accessed varyings because nir_opt_varyings doesn't touch indirect varyings. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> (Rb for asahi) Reviewed-by: Pavel Ondračka <pavel.ondracka@gmail.com> (for r300) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32423>	2024-12-03 12:57:36 +00:00
Yogesh Mohan Marimuthu	f930201898	ac/gpu_info: populate fw info using new fw info ioctl for userq Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	8c91624614	winsys/amdgpu: use VM_ALWAYS_VALID for all VRAM and GTT allocations Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	f0d31eda25	winsys/amdgpu: keep has_local_buffers true for userq In case of userqueue, kernel bo kms_handle will not hold fences for non shared bo. Non shared bo fences are taken care within mesa. Hence need to copy the data to another shared buffer for export. Keeping has_local_buffers true for userq will make non shared bo to be copied to shared bo for export in si_texture_get_handle(). Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	8447cb563f	winsys/amdgpu: send hdp flush packet for userq Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	45fa34284f	winsys/amdgpu: don't add fence dependency of other queues for userq In case of userq, there will be only 1 userq per process. So all the jobs for that process goes into single queue. Hence there is no need to add fence of other queues even if info num_queues is > 1. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	93703d2d19	winsys/amdgpu: add userq cmd submission support in amdgpu_cs_submit_ib() This patch adds the job submission code for userq. An indirect buffer, in short ib, can be considered a job. The job is submitted directly to the userq ring buffer and the doorbell is rung to notify the firmware to execute the job. The packets submitted to execute the job are listed below: 1) fence wait multi packet for any dependency fence 2) hdp flush packet to flush host data path 3) indirect buffer packet 4) protected signal packet Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	97664d9e84	winsys/amdgpu: move legacy chunk init and submission to new function Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	afeb500498	winsys/amdgpu: move noop and ib_bytes adjustment to cs_flush Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	6e813b99af	winsys/amdgpu: wait for vm syncobj before creating userq Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00
Yogesh Mohan Marimuthu	2a499412e5	winsys/amdgpu: pass job fences to VM ioctl In case of userq, fences are not installed in the kernel kms handle. Fences are handled internally in mesa. So when unmapping a buffer, fences will have to be passed by mesa to the kernel so that the kernel can wait on these fences to unmap the buffer. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010>	2024-12-03 12:02:06 +00:00

198710 commits