fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-20 04:48:07 +02:00

Author	SHA1	Message	Date
Rhys Perry	a93bd52f4f	nir/lower_int64: allow 64-bit comparisons when lowering minmax RADV doesn't need these to be lowered. fossil-db (navi31): Totals from 1 (0.00% of 79242) affected shaders: Instrs: 28 -> 26 (-7.14%) CodeSize: 140 -> 128 (-8.57%) Latency: 605 -> 604 (-0.17%) Copies: 5 -> 6 (+20.00%) VALU: 14 -> 13 (-7.14%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27335>	2024-03-06 15:23:18 +00:00
Rhys Perry	b37804c8de	nir/algebraic: optimize 64-bit comparisons with zero'd halves to 32-bit These expect nir_lower_int64 to replace u2u64 to pack_64_2x32_split(, 0). fossil-db (navi31): Totals from 149 (0.19% of 79242) affected shaders: Instrs: 433095 -> 431830 (-0.29%); split: -0.29%, +0.00% CodeSize: 2165980 -> 2160284 (-0.26%); split: -0.27%, +0.00% SpillSGPRs: 689 -> 688 (-0.15%) Latency: 3801497 -> 3799901 (-0.04%); split: -0.05%, +0.01% InvThroughput: 1547916 -> 1546567 (-0.09%); split: -0.09%, +0.01% VClause: 4698 -> 4693 (-0.11%) SClause: 9981 -> 9977 (-0.04%); split: -0.05%, +0.01% Copies: 66148 -> 65431 (-1.08%); split: -1.09%, +0.01% PreSGPRs: 6732 -> 6729 (-0.04%) PreVGPRs: 7976 -> 7945 (-0.39%) VALU: 252936 -> 252336 (-0.24%) SALU: 51794 -> 51274 (-1.00%); split: -1.03%, +0.02% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27335>	2024-03-06 15:23:18 +00:00
Rhys Perry	417eb390c6	nir/algebraic: remove duplicated iand(ien, ine)/ior(ieq, ieq) patterns These don't seem useful, since they're already done in the early optimizations. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27335>	2024-03-06 15:23:18 +00:00
Rhys Perry	6952bb359c	nir/algebraic: don't create 64-bit min/max/ior if lowered fossil-db (navi31): Totals from 58 (0.07% of 79242) affected shaders: Instrs: 11692 -> 11304 (-3.32%) CodeSize: 65836 -> 62412 (-5.20%) VGPRs: 1320 -> 1344 (+1.82%) Latency: 51712 -> 50234 (-2.86%) InvThroughput: 10190 -> 10160 (-0.29%) Copies: 460 -> 688 (+49.57%) VALU: 6130 -> 5897 (-3.80%) SALU: 1231 -> 1284 (+4.31%); split: -0.32%, +4.63% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27335>	2024-03-06 15:23:18 +00:00
Georg Lehmann	1d8b2b159e	nir/divergence_analysis: fix subgroup mask These depend on the subgroup invocation id, so they are divergent. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Fixes: `df86c5ffb3` ("nir: add divergence analysis pass.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27962>	2024-03-05 14:52:17 +00:00
Georg Lehmann	230743da2e	nir: remove rotate scope All other subgroup operations do not have a scope in NIR, so for consistency rotate shouldn't have one either. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27964>	2024-03-05 14:12:21 +00:00
Timothy Arceri	edf242f825	nir: add some nir_parameter fields These will be used in future to do more validation on functions as the glsl nir linker is expanded. The first use is in the following patch. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27841>	2024-03-04 05:40:55 +00:00
Timothy Arceri	7afce96b80	glsl_to_nir: merge function param handling Here we remove the special handling for input params that was hard to work with and unite it with the output and inout params. Here a mediump test needs to be updated to what is a more expected outcome anyway. We also need to update the code that inserts software f64 to the new way input params are handled. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27108>	2024-03-04 11:31:20 +11:00
Job Noorman	96c2fe3e1a	nir: add search helper is_only_used_by_if Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27411>	2024-03-01 13:45:11 +00:00
Lionel Landwerlin	259cdc5496	nir: add additional flag to resource_intel for embedded samplers This will enable specific lowering of embedded samplers. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22151>	2024-02-29 07:05:06 +00:00
Faith Ekstrand	f4fb5277c3	nir: Add an imad opcode Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27159>	2024-02-27 21:51:30 -06:00
Ian Romanick	a2292f53b5	nir: Optimize uniform vote_all and vote_any No shader-db changes on any Intel platform. fossil-db: All Ice Lake and newer platforms had similar results. (Ice Lake) Totals: Instrs: 165513303 -> 165511820 (-0.00%) Cycles: 15125314947 -> 15125211500 (-0.00%); split: -0.00%, +0.00% Totals from 82 (0.01% of 656120) affected shaders: Instrs: 544627 -> 543144 (-0.27%) Cycles: 22616493 -> 22513046 (-0.46%); split: -0.46%, +0.00% No fossil-db changes on Gfx9. Suggested-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27044>	2024-02-27 09:44:32 -08:00
Ian Romanick	535caaf3e0	nir: Optimize uniform iadd, fadd, and ixor reduction operations This adds optimizations for iadd, fadd, and ixor with reduce, inclusive scan, and exclusive scan. NOTE: The fadd and ixor optimizations had no shader-db or fossil-db changes on any Intel platform. NOTE 2: This change "fixes" arb_compute_variable_group_size-local-size and base-local-size.shader_test on DG2 and MTL. This is just changing the code path taken to not use whatever path was not working properly before. This is a subset of the things optimized by ACO. See also https://gitlab.freedesktop.org/mesa/mesa/-/issues/3731#note_682802. The min, max, iand, and ior exclusive_scan optimizations are not implemented. Broadwell on shader-db is not happy. I have not investigated. v2: Silence some warnings about discarding const. v3: Rename mbcnt to count_active_invocations. Add a big comment explaining the differences between the two paths. Suggested by Rhys. shader-db: All Gfx9 and newer platforms had similar results. (Ice Lake shown) total instructions in shared programs: 20300384 -> 20299545 (<.01%) instructions in affected programs: 19167 -> 18328 (-4.38%) helped: 35 / HURT: 0 total cycles in shared programs: 842809750 -> 842766381 (<.01%) cycles in affected programs: 2160249 -> 2116880 (-2.01%) helped: 33 / HURT: 2 total spills in shared programs: 4632 -> 4626 (-0.13%) spills in affected programs: 206 -> 200 (-2.91%) helped: 3 / HURT: 0 total fills in shared programs: 5594 -> 5581 (-0.23%) fills in affected programs: 664 -> 651 (-1.96%) helped: 3 / HURT: 1 fossil-db results: All Intel platforms had similar results. (Ice Lake shown) Totals: Instrs: 165551893 -> 165513303 (-0.02%) Cycles: 15132539132 -> 15125314947 (-0.05%); split: -0.05%, +0.00% Spill count: 45258 -> 45204 (-0.12%) Fill count: 74286 -> 74157 (-0.17%) Scratch Memory Size: 2467840 -> 2451456 (-0.66%) Totals from 712 (0.11% of 656120) affected shaders: Instrs: 598931 -> 560341 (-6.44%) Cycles: 184650167 -> 177425982 (-3.91%); split: -3.95%, +0.04% Spill count: 983 -> 929 (-5.49%) Fill count: 2274 -> 2145 (-5.67%) Scratch Memory Size: 52224 -> 35840 (-31.37%) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27044>	2024-02-27 09:44:11 -08:00
Ian Romanick	f10d1ef372	nir: Initial framework for optimizing uniform subgroup operations The first commit just optimizes operations where the result of the subgroup operation is the same as each of the individual channel results. This is a subset of the things optimized by ACO. See also https://gitlab.freedesktop.org/mesa/mesa/-/issues/3731#note_682802. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27044>	2024-02-27 08:38:31 -08:00
Ian Romanick	75de4458a1	nir: Mark nir_intrinsic_load_global_block_intel as divergent This is divergent because it specifically loads sequential values into successive SIMD lanes. No shader-db or fossil-db changes on any Intel platform. Fixes: `9f44a26462` ("nir/divergence: handle load_global_block_intel") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27044>	2024-02-27 08:36:42 -08:00
Ian Romanick	5da5106727	nir: Add documentation for subgroup_.._mask v2: Fix reference to GL_ARB_shader_ballot. Noticed by Lionel. Suggested-by: Lionel Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27044>	2024-02-27 08:36:09 -08:00
Sagar Ghuge	30ead72e80	nir: Allow nir_texop_tg4 in implicit derivative This allows us to invoke the quad helper. v2: (Georg) - Add check for is_gather_implicit_lod Fixes: `48158636bf` ("nir: add is_gather_implicit_lod") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27447>	2024-02-27 00:22:46 +00:00
Alyssa Rosenzweig	6825902bb6	treewide: use ralloc_memdup @@ expression memctx, dst, src, size; @@ -dst = ralloc_size(memctx, size); -memcpy(dst, src, size); +dst = ralloc_memdup(memctx, src, size); Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27762>	2024-02-26 15:37:58 +00:00
Timur Kristóf	cc1501628f	nir: Clean up divergence analysis for TES patch input loads. Just make the code a little bit easier to follow. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27680>	2024-02-26 14:53:23 +00:00
Timur Kristóf	870a2e4197	nir: Cleanup divergence analysis for mesh shaders. 1. Mesh shaders don't have inputs (only task payload), so remove them from handling load_input. 2. Clarify in comments that loading any mesh shader output is an NV_mesh_shader only feature. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27680>	2024-02-26 14:53:23 +00:00
Timur Kristóf	9553d67373	nir: Fix divergence analysis of load_patch_vertices_in. load_patch_vertices_in can only occur in tessellation shaders, and contains the number of vertices in an input patch. * TCS: patch_vertices_in is equal to the input patch size * TES: patch_vertices_in is equal to the TCS output patch size The patch sizes may be set by a pipeline or dynamic states, however in both cases it is definitely uniform within a subgroup. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27680>	2024-02-26 14:53:23 +00:00
Timur Kristóf	537c0029dd	nir: Fix divergence of reductions. By accident, the function would return without setting the divergence information. Cc: mesa-stable Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27680>	2024-02-26 14:53:23 +00:00
Timothy Arceri	ec240e2cd8	nir: allow gather info to handle nir_deref_type_array_wildcard Needed for some changes to the glsl nir linker in the following patches. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27669>	2024-02-20 23:29:17 +00:00
Bas Nieuwenhuizen	c7b2ac3377	radv: Remove ray_launch_size_addr_amd system value. Not used anymore, so clean it up. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27664>	2024-02-17 11:08:16 +00:00
Caio Oliveira	a88084f8be	intel/compiler: Rename brw_image_param to isl_image_param And move them to ISL. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27475>	2024-02-14 22:31:23 -08:00
Alyssa Rosenzweig	cb0b027c59	asahi: make clip_halfz dynamic we could move this to the linker but meh, this is good enough for now Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27616>	2024-02-14 21:02:32 +00:00
Alyssa Rosenzweig	6673924b7e	asahi: make gs topology dynamic even with shobjs, we know the class of topology statically, so we just need to select between the (up to) 3 compatible topologies, and luckily there are common subexpressions we can factor out when calculating all 3 at once. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27616>	2024-02-14 21:02:32 +00:00
Alyssa Rosenzweig	17896f1699	nir: rm load_vert_id_in_prim_agx now unused since we separate vs/gs Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27616>	2024-02-14 21:02:31 +00:00
Alyssa Rosenzweig	c6c8262ce1	asahi: implement pipeline stats as a checkbox real impl is blocked on uapi to plumb thru hw perf counters. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27616>	2024-02-14 21:02:30 +00:00
Asahi Lina	b89da92a5e	agx: compiler: Add fence_helper_exit_agx barrier This is used by the helper program on exit. Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27616>	2024-02-14 21:02:29 +00:00
Asahi Lina	b07dbf7b0f	nir: Add AGX-specific helper opcodes These opcodes are used by the helper program to fetch the current operation info and core ID. Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27616>	2024-02-14 21:02:29 +00:00
Alyssa Rosenzweig	311070f7af	nir: add active_subgroup_invocation_agx sysval Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27616>	2024-02-14 21:02:29 +00:00
Alyssa Rosenzweig	5dc0f5ccba	asahi: implement VBO robustness GL semantics. GLES (weaker) and VK (stronger) semantics are left as a todo, with explanations given. Enabled always to deal with null VBOs, this should be optimized once we have soft fault. This necessitates a rework of VBO keys, but hopefully for the best. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27616>	2024-02-14 21:02:29 +00:00
Alyssa Rosenzweig	9753cd44f7	asahi: Implement skeleton for tessellation This implements a rough skeleton of what's needed for tessellation. It contains the relevant lowerings to merge the VS and TCS, running them as a compute kernel, and to lower the TES to a new VS (possibly merged in with a subsequent GS). This is sufficient for both standalone tessellation and tess + geom/xfb together. It does not yet contain a GPU accelerated tessellator, simply falling back to the CPU for that for now. Nevertheless the data structures are engineered with that end goal in mind, in particular to be able to tessellate all patches in parallel without needing any prefix sums etc (using simple watermark allocation for the heap). Work on fleshing out the skeleton continues in parallel. For now, this does pass the tests and lets the harder stuff get regression tested more easily. And merging early will ease rebase. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27616>	2024-02-14 21:02:28 +00:00
Alyssa Rosenzweig	2d37d1b704	asahi: lower poly stipple Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27616>	2024-02-14 21:02:28 +00:00
Mike Blumenkrantz	9e2c7314f2	nir/lower_io: fix handling for compact arrays with indirect derefs this logic relies on constant indexing for compact arrays, but this is frequently not the case for compact array builtins (e.g., gl_TessLevelOuter). the usual strategy of lowering to temps isn't viable in TCS, which means io lowering has to be able to handle indirect access to these builtins without crashing cc: mesa-stable Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27534>	2024-02-13 16:13:13 +00:00
Karol Herbst	727cddd338	nir/lower_cl_images: record image_buffers and msaa_images Cc: mesa-stable Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27385>	2024-02-13 10:12:13 +00:00
Connor Abbott	6a744ddebc	ir3: Initial support for pushing globals with ldg.k Add a separate pass which uses the analyze_ubo_ranges machinery to construct ranges of readonly globals accessed in the shader and push them to constants in the preamble, using ldg.k if possible. This is enough to handle inline uniforms in turnip but also provides a base for OpenCL, although the pass would need further work for that. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>	2024-02-12 22:05:13 +00:00
Connor Abbott	45c71803f9	tu: Add more info to ldg inline uniform path This will let us push the ldg into the preamble. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>	2024-02-12 22:05:13 +00:00
Sagar Ghuge	c984d6e2fc	nir: Drop intel specific lowering code In previous patches, we have moved the Intel specific lowering code in brw_nir_lower_texture file. We can go ahead and drop the Intel specific texture source too. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27458>	2024-02-12 21:25:48 +00:00
Alyssa Rosenzweig	9c006d5dce	nir/passthrough_gs: flesh out gs_in_prim geometry shaders don't specify the input topology, only the class of topology. normalize when generating a passthrough gs. asahi will be more picky about this in the future. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Antonino Maniscalco <antonino.maniscalco@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27457>	2024-02-09 11:53:31 +00:00
Marek Olšák	e98bbcad17	nir: add vertex divergence into nir_divergence_analysis This is a prerequisite for the new nir_opt_varyings pass. It reuses the same divergent field in nir_def and nir_loop. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26918>	2024-02-02 16:45:52 -05:00
Marek Olšák	5ffa4d879c	nir: add a lower_mediump_io callback into options This will be called by the GLSL linker before nir_opt_varyings. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26918>	2024-02-02 16:45:51 -05:00
Marek Olšák	ecf0fe09f0	nir: replace lower_io_variables with a GLSL NIR flag This stops using it in nir_lower_io_passes because all callers call it only when it's true. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26918>	2024-02-02 16:45:49 -05:00
Marek Olšák	c4acab77a8	nir: remove and replace underused option pack_varying_options This will also be used by nir_opt_varyings. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26918>	2024-02-02 16:45:47 -05:00
Marek Olšák	c844b5dc85	nir: relax validation failure for generic TCS outputs with no_varying Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26918>	2024-02-02 16:45:45 -05:00
Job Noorman	60413e11c2	ir3: optimize subgroup operations using brcst.active Follow the blob and optimize subgroup operation using brcst.active and getlast when supported. The transformation consists of two parts. First, a NIR transform replaces subgroup operations with a sequence of new brcst_active_ir3 intrinsics followed by a new [type]_clusters_ir3 intrinsic (where type can be reduce, inclusive_scan, or exclusive_scan). The brcst_active_ir3 intrinsic is lowered directly to a brcst.active instruction. The other intrinsics get lowered to a new macro (OPC_SCAN_CLUSTERS_MACRO) which later gets emitted as a loop (using getlast/getone) that iterates all clusters and produces the requested scan result. OPC_SCAN_CLUSTERS_MACRO has a number of optional arguments. First, since the exclusive scan result is not a natural by-product of the loop but has to be calculated explicitly, its destination is optional. This is necessary since adding it unconditionally will produce unused instructions that won't be DCE'd anymore at this point. Second, when performing 32b MUL_U reductions (that expand to multiple instructions), an extra scratch register is necessary. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6387 Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26950>	2024-02-02 19:49:22 +00:00
Konstantin Seurer	c925b6019d	radv/rt: Lower ray payloads like hit attribs Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27051>	2024-02-02 16:36:15 +00:00
Ian Romanick	c8ba2bc2f0	nir: Pack texture LOD and array index to a single 32-bit value v2: Fix clamped_ai calculation in nir_lower_tex.c. Add nir_tex_src_combined_lod_and_array_index_intel to print_tex_instr. Suggested by Sagar. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27305>	2024-02-02 02:39:10 +00:00
Konstantin Seurer	e3c2dc2324	nir/print: Rename workgroup-size to workgroup_size Every other field uses _ instead of -. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27318>	2024-01-30 21:19:40 +00:00

5157 commits