fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 22:28:06 +02:00

Author	SHA1	Message	Date
Georg Lehmann	3e6e1e213c	nir: remove fall_equal/fany_nequal opcodes Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40197>	2026-03-04 19:50:27 +00:00
Georg Lehmann	609c46cf23	nir/lower_alu_width: emit f2f32 for unpack_half_2x16 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39511>	2026-02-06 06:12:36 +00:00
Georg Lehmann	a706769a0b	nir: move exact bit to nir_fp_math_control Unifies nir per instruction float control. In the future this can be split into contract/reassoc/transform like SPIR-V. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (except SPIR-V) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39103>	2026-01-07 09:40:57 +00:00
Georg Lehmann	f3290219ab	nir: use a seperate enum for per alu floating point math control We don't need one bit per bitsize per instruction if only one actually matters in the end. First step towards moving NIR in the direction of full float_controls2 only. Also rename this from fp_fast_math, because that name implied that 0 is the no fast math mode, while the opposite was the case. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39026>	2025-12-29 10:57:05 +00:00
Simon Perretta	6dd0a5ee2d	pvr, pco: switch to clc query shaders Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37439>	2025-09-22 14:52:04 +01:00
Simon Perretta	6edb72d28b	pco: replace {un,}packing alu ops with intrinsics Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36412>	2025-09-16 18:26:19 +00:00
Simon Perretta	8104ef4e01	pco: support 1010102 snorm, [us]scaled formats Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36412>	2025-09-16 18:26:19 +00:00
Simon Perretta	b50f0b47d2	pco: add support for sscaled8* formats Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36412>	2025-09-16 18:26:09 +00:00
Simon Perretta	db686e190a	pvr, pco: per frag/vertex input/output rework Adds support for packing and unpacking r10g10b10a2 unorm and r11g11b10 float formats, as well as partial 2x16 and 4x8 formats. Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36412>	2025-09-16 18:26:09 +00:00
Simon Perretta	b7c0863b97	pco: add uadd64_32 op Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36412>	2025-09-16 18:26:08 +00:00
Antonio Ospite	ddf2aa3a4d	build: avoid redefining unreachable() which is standard in C23 In the C23 standard unreachable() is now a predefined function-like macro in <stddef.h> See https://android.googlesource.com/platform/bionic/+/HEAD/docs/c23.md#is-now-a-predefined-function_like-macro-in And this causes build errors when building for C23: ----------------------------------------------------------------------- In file included from ../src/util/log.h:30, from ../src/util/log.c:30: ../src/util/macros.h:123:9: warning: "unreachable" redefined 123 \| #define unreachable(str) \ \| ^~~~~~~~~~~ In file included from ../src/util/macros.h:31: /usr/lib/gcc/x86_64-linux-gnu/14/include/stddef.h:456:9: note: this is the location of the previous definition 456 \| #define unreachable() (__builtin_unreachable ()) \| ^~~~~~~~~~~ ----------------------------------------------------------------------- So don't redefine it with the same name, but use the name UNREACHABLE() to also signify it's a macro. Using a different name also makes sense because the behavior of the macro was extending the one of __builtin_unreachable() anyway, and it also had a different signature, accepting one argument, compared to the standard unreachable() with no arguments. This change improves the chances of building mesa with the C23 standard, which for instance is the default in recent AOSP versions. All the instances of the macro, including the definition, were updated with the following command line: git grep -l '[^_]unreachable(' -- "src/**" \| sort \| uniq \| \ while read file; \ do \ sed -e 's/\([^_]\)unreachable(/\1UNREACHABLE(/g' -i "$file"; \ done && \ sed -e 's/#undef unreachable/#undef UNREACHABLE/g' -i src/intel/isl/isl_aux_info.c Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36437>	2025-07-31 17:49:42 +00:00
Georg Lehmann	ba63263f32	nir: add bfdot2_bfadd and use it for lowering bfdot if supported Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768>	2025-05-09 11:20:26 +00:00
Caio Oliveira	cf4021f93c	nir: Add opcodes for BFloat16 SPV_KHR_bfloat16 requires a small set of operations, since it doesn't support all the arithmetic ops. This patch adds conversions to/from Float32 and also the necessary ops (bfdot, bffma, bfmul) to implement SpvOpDot using the same lowering approach than the Float32 counterpart. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>	2025-04-29 16:29:36 +00:00
Alyssa Rosenzweig	9b07550908	treewide: use nir_shader_alu_pass @def@ typedef bool; typedef nir_builder; typedef nir_instr; typedef nir_def; identifier fn, instr, intr, x, builder, data; @@ static fn(nir_builder* builder, -nir_instr *instr, +nir_alu_instr *intr, ...) { ( - if (instr->type != nir_instr_type_alu) - return false; - nir_alu_instr *intr = nir_instr_as_alu(instr); \| - nir_alu_instr *intr = nir_instr_as_alu(instr); - if (instr->type != nir_instr_type_alu) - return false; ) <... ( -instr->x +intr->instr.x \| -instr +&intr->instr ) ...> } @pass depends on def@ identifier def.fn; expression shader, progress; @@ ( -nir_shader_instructions_pass(shader, fn, +nir_shader_alu_pass(shader, fn, ...) \| -NIR_PASS_V(shader, nir_shader_instructions_pass, fn, +NIR_PASS_V(shader, nir_shader_alu_pass, fn, ...) \| -NIR_PASS(progress, shader, nir_shader_instructions_pass, fn, +NIR_PASS(progress, shader, nir_shader_alu_pass, fn, ...) ) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30582>	2024-08-10 13:40:21 +00:00
Alyssa Rosenzweig	15257b65c6	treewide: use nir_metadata_control_flow Via Coccinelle patch: @@ @@ -nir_metadata_block_index \| nir_metadata_dominance +nir_metadata_control_flow ...plus some manual fixups for call sites missed by coccinelle. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Acked-by: Karol Herbst <kherbst@redhat.com> Acked-by: Juan A. Suarez Romero <jasuarez@igalia.com> [broadcom] Acked-by: Vasily Khoruzhick <anarsoul@gmail.com> [lima] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29745>	2024-06-17 16:28:14 -04:00
Georg Lehmann	f66883a875	nir: lower pack_uvec4_to_uint to pack_32_4x8 if supported Foz-DB Navi31: Totals from 42 (0.05% of 79395) affected shaders: Instrs: 3326544 -> 3324640 (-0.06%) CodeSize: 16908376 -> 16896212 (-0.07%); split: -0.07%, +0.00% VGPRs: 4284 -> 4296 (+0.28%) Latency: 17862544 -> 17855438 (-0.04%); split: -0.05%, +0.01% InvThroughput: 3535291 -> 3533993 (-0.04%); split: -0.04%, +0.00% VClause: 95270 -> 95275 (+0.01%); split: -0.01%, +0.01% SClause: 65402 -> 65397 (-0.01%) Copies: 229723 -> 234124 (+1.92%) Branches: 109481 -> 109518 (+0.03%); split: -0.00%, +0.04% PreVGPRs: 3879 -> 3909 (+0.77%) VALU: 1789208 -> 1787370 (-0.10%); split: -0.10%, +0.00% SALU: 409136 -> 409129 (-0.00%); split: -0.00%, +0.00% Tested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28882>	2024-06-04 17:00:29 +00:00
Georg Lehmann	dcab408a6c	nir: remove unpack_half_flush_to_zero It doesn't make sense to have two sets of opcodes for this when all backends that support the flush_to_zero variant just rely on the global floating point mode anyway. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29433>	2024-05-31 09:46:35 +00:00
Iván Briano	666647acae	nir: track some float controls bits per instruction With float_controls2, shaders can decide on the behavior of NaN/Inf/SignedZero preservation by decorating specific instructions, on top of having a default for the whole program. Add where to track these to nir_alu_instr and propagate them to new instructions everywhere that exact is being done already. v2: use less bits for fp_fast_math in nir_alu_instr (Alyssa) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27281>	2024-04-25 12:13:41 +00:00
Rhys Perry	08903bbe89	nir: add mqsad_4x8, shfr and nir_opt_mqsad Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26251>	2024-04-05 11:01:39 +00:00
Karol Herbst	807ff7ed01	nir: add nir_lower_alu_vec8_16_srcs pass This pass is useful for vector based backends as we might end up with alu instructions referencing vec8/vec16 values even though being vec4 or smaller themselves. This new pass intents to clean up any use of vec8/vec16 sources other passes won't. Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25330>	2023-09-27 11:54:13 +00:00
Timothy Arceri	84e0f5ce75	nir: remove unused param from nir_alu_src_copy() Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24986>	2023-09-08 03:01:39 +00:00
Faith Ekstrand	6c1d32581a	nir: Drop nir_alu_dest Instead, we replace it directly with nir_def. We could replace it with nir_dest but the next commit gets rid of that so this avoids unnecessary churn. Most of this commit was generated by sed: sed -i -e 's/dest.dest.ssa/def/g' src/*/.h src/*/.c src/*/.cpp There were a few manual fixups required in the nir_legacy.c and nir_from_ssa.c as nir_legacy_reg and nir_parallel_copy_entry both have a similar pattern. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24674>	2023-08-14 21:22:53 +00:00
Faith Ekstrand	ed9affa02f	nir: Drop most instances of nir_ssa_dest_init() Generated using the following two semantic patches: @@ expression I, J, NC, BS; @@ -nir_ssa_dest_init(I, &J->dest, NC, BS); +nir_def_init(I, &J->dest.ssa, NC, BS); @@ expression I, J, NC, BS; @@ -nir_ssa_dest_init(I, &J->dest.dest, NC, BS); +nir_def_init(I, &J->dest.dest.ssa, NC, BS); Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24658>	2023-08-13 17:12:52 +00:00
Alyssa Rosenzweig	09d31922de	nir: Drop "SSA" from NIR language Everything is SSA now. sed -e 's/nir_ssa_def/nir_def/g' \ -e 's/nir_ssa_undef/nir_undef/g' \ -e 's/nir_ssa_scalar/nir_scalar/g' \ -e 's/nir_src_rewrite_ssa/nir_src_rewrite/g' \ -e 's/nir_gather_ssa_types/nir_gather_types/g' \ -i $(git grep -l nir \| grep -v relnotes) git mv src/compiler/nir/nir_gather_ssa_types.c \ src/compiler/nir/nir_gather_types.c ninja -C build/ clang-format cd src/compiler/nir && find *.c *.h -type f -exec clang-format -i \{} \; Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Acked-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24585>	2023-08-12 16:44:41 -04:00
Faith Ekstrand	777d336b1f	nir: clang-format src/compiler/nir/*.[ch] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24382>	2023-08-12 19:27:28 +00:00
Alyssa Rosenzweig	42ee8a55dd	nir: Remove nir_alu_dest::write_mask Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>	2023-08-03 22:40:30 +00:00
Alyssa Rosenzweig	5fead24365	treewide: Drop is_ssa asserts We only see SSA now. Via Coccinelle patch: @@ expression x; @@ -assert(x.is_ssa); Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>	2023-08-03 22:40:28 +00:00
Alyssa Rosenzweig	d559764e7c	nir: Remove nir_alu_dest::saturate Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>	2023-08-03 22:40:28 +00:00
Rhys Perry	25c49e491f	aco,ac/llvm,ac/nir,vtn: unify cube opcodes fossil-db (navi21): Totals from 17068 (12.79% of 133461) affected shaders: Instrs: 24743703 -> 24743572 (-0.00%); split: -0.00%, +0.00% CodeSize: 132579952 -> 132580620 (+0.00%); split: -0.00%, +0.00% VGPRs: 1227840 -> 1227984 (+0.01%) Latency: 403180114 -> 403251188 (+0.02%); split: -0.00%, +0.02% InvThroughput: 75311302 -> 75320892 (+0.01%); split: -0.00%, +0.01% VClause: 415400 -> 415402 (+0.00%); split: -0.00%, +0.00% Copies: 1715404 -> 1715258 (-0.01%); split: -0.01%, +0.01% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> (r600) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23930>	2023-06-30 15:35:03 +00:00
Alyssa Rosenzweig	01e9ee79f7	nir: Drop unused name from nir_ssa_dest_init Since `624e799cc3` ("nir: Drop nir_ssa_def::name and nir_register::name"), SSA defs don't have names, making the name argument unused. Drop it from the signature and fix the call sites. This was done with the help of the following Coccinelle semantic patch: @@ expression A, B, C, D, E; @@ -nir_ssa_dest_init(A, B, C, D, E); +nir_ssa_dest_init(A, B, C, D); Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23078>	2023-05-17 23:46:16 +00:00
Rhys Perry	50f7e21481	nir: make fdph lowering match fdot Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20812>	2023-03-08 14:38:26 +00:00
Rhys Perry	3668da7c83	nir: use xyzw order for precise fdot Fixes flickering grass in Immortals Fenyx Rising. fossil-db (gfx1100): Totals from 13969 (10.38% of 134574) affected shaders: MaxWaves: 442794 -> 442878 (+0.02%) Instrs: 4861105 -> 4901408 (+0.83%); split: -0.02%, +0.85% CodeSize: 24316100 -> 24396272 (+0.33%); split: -0.03%, +0.35% VGPRs: 446256 -> 445572 (-0.15%); split: -0.20%, +0.05% Latency: 28122456 -> 28162233 (+0.14%); split: -0.10%, +0.24% InvThroughput: 2899673 -> 2904323 (+0.16%); split: -0.07%, +0.23% VClause: 119599 -> 119631 (+0.03%); split: -0.07%, +0.09% SClause: 186636 -> 186265 (-0.20%); split: -0.23%, +0.03% Copies: 301370 -> 300386 (-0.33%); split: -0.75%, +0.42% Branches: 85066 -> 85047 (-0.02%); split: -0.02%, +0.00% PreSGPRs: 436167 -> 436137 (-0.01%) PreVGPRs: 329715 -> 329809 (+0.03%); split: -0.01%, +0.04% fossil-db (gfx1100, RADV_DEBUG=invariantgeom): Totals from 43116 (32.04% of 134574) affected shaders: MaxWaves: 1332938 -> 1333012 (+0.01%); split: +0.01%, -0.00% Instrs: 16424513 -> 16658021 (+1.42%); split: -0.06%, +1.48% CodeSize: 81258868 -> 81827860 (+0.70%); split: -0.07%, +0.77% VGPRs: 1720368 -> 1719648 (-0.04%); split: -0.19%, +0.15% SpillSGPRs: 1670 -> 1600 (-4.19%); split: -5.27%, +1.08% Latency: 82063766 -> 82425418 (+0.44%); split: -0.23%, +0.67% InvThroughput: 9665803 -> 9727810 (+0.64%); split: -0.09%, +0.73% VClause: 449662 -> 451099 (+0.32%); split: -0.32%, +0.64% SClause: 498841 -> 498639 (-0.04%); split: -0.24%, +0.20% Copies: 1001020 -> 1000770 (-0.02%); split: -1.20%, +1.17% Branches: 237580 -> 239637 (+0.87%); split: -0.01%, +0.88% PreSGPRs: 1198167 -> 1198024 (-0.01%); split: -0.01%, +0.00% PreVGPRs: 1225202 -> 1225035 (-0.01%); split: -0.06%, +0.05% fossil-db (navi10): Totals from 13969 (10.38% of 134563) affected shaders: MaxWaves: 474386 -> 474508 (+0.03%); split: +0.05%, -0.03% Instrs: 3740895 -> 3771566 (+0.82%); split: -0.00%, +0.82% CodeSize: 19426592 -> 19459916 (+0.17%); split: -0.00%, +0.18% VGPRs: 389916 -> 389852 (-0.02%); split: -0.09%, +0.07% Latency: 25452927 -> 25502482 (+0.19%); split: -0.14%, +0.34% InvThroughput: 3880807 -> 3923144 (+1.09%); split: -0.07%, +1.16% VClause: 66835 -> 66712 (-0.18%); split: -0.38%, +0.20% SClause: 178805 -> 178802 (-0.00%); split: -0.01%, +0.01% Copies: 167601 -> 167625 (+0.01%); split: -0.54%, +0.56% Branches: 83788 -> 83784 (-0.00%) PreSGPRs: 388229 -> 388216 (-0.00%) PreVGPRs: 342984 -> 343062 (+0.02%); split: -0.01%, +0.03% fossil-db (navi10, RADV_DEBUG=invariantgeom): Totals from 43116 (32.04% of 134563) affected shaders: MaxWaves: 1260184 -> 1256414 (-0.30%); split: +0.10%, -0.40% Instrs: 12804951 -> 12983628 (+1.40%); split: -0.01%, +1.41% CodeSize: 65813224 -> 66137852 (+0.49%); split: -0.03%, +0.52% VGPRs: 1556396 -> 1561340 (+0.32%); split: -0.09%, +0.41% SpillSGPRs: 1377 -> 1395 (+1.31%) Latency: 76095867 -> 76355111 (+0.34%); split: -0.32%, +0.66% InvThroughput: 13546863 -> 13788789 (+1.79%); split: -0.05%, +1.84% VClause: 310910 -> 311283 (+0.12%); split: -0.63%, +0.75% SClause: 474878 -> 474941 (+0.01%); split: -0.09%, +0.10% Copies: 639367 -> 637610 (-0.27%); split: -1.03%, +0.76% Branches: 240178 -> 240185 (+0.00%); split: -0.00%, +0.00% PreSGPRs: 1056594 -> 1056590 (-0.00%); split: -0.00%, +0.00% PreVGPRs: 1247950 -> 1247798 (-0.01%); split: -0.05%, +0.04% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7920 Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20812>	2023-03-08 14:38:26 +00:00
Marek Olšák	b80bd58265	nir: skip nir_op_unpack_32_4x8 in nir_lower_alu_width The pass can't handle it just like the other unpack opcodes and generates invalid NIR. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19399>	2023-03-03 03:27:40 +00:00
Rhys Perry	aa2d6e020b	Revert "nir: Drop the unused instr arg for src/dest copy functions." This reverts commit `c3a0184118`. Acked-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12910>	2022-08-30 18:21:44 +00:00
Daniel Schürmann	be01e8711b	nir: introduce new nir_alu_alu_width() with nir_vectorize_cb callback This function allows to only scalarize instructions down to a desired vectorization width. nir_lower_alu_to_scalar() was changed to use the new function with a width of 1. Swizzles outside vectorization width are considered and reduce the target width. This prevents ending up with code like vec2 16 ssa_2 = iadd ssa_0.xz, ssa_1.xz which requires to emit shuffle code in backends and usually is not beneficial. Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13080>	2022-06-01 11:41:44 +00:00

35 commits