fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-01-31 15:40:24 +01:00

Author	SHA1	Message	Date
Georg Lehmann	ba63263f32	nir: add bfdot2_bfadd and use it for lowering bfdot if supported Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768>	2025-05-09 11:20:26 +00:00
Caio Oliveira	cf4021f93c	nir: Add opcodes for BFloat16 SPV_KHR_bfloat16 requires a small set of operations, since it doesn't support all the arithmetic ops. This patch adds conversions to/from Float32 and also the necessary ops (bfdot, bffma, bfmul) to implement SpvOpDot using the same lowering approach than the Float32 counterpart. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>	2025-04-29 16:29:36 +00:00
Benjamin Lee	252c59602e	panfrost: implement 16-bit ldexp Bifrost LDEXP.v2f16 takes a 16-bit exponent, which requires messy lowering. The codegen for this is quite bad currently, but would be improved by implementing unpack_32_2x16_split_*, and by fusing comparisons with CSEL. The main alternative is converting to F32, then LDEXP.f32, then converting back to F16. This has better codegen for dynamic exponents currently, but worse in the common case with a constant exponent where all the saturating cast logic can be folded. Fixes dEQP-VK.glsl.builtin.precision_fp16_storage16b.ldexp.compute.vec2 when shaderFloat16 is enabled in panvk. Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Rebecca Mckeever <rebecca.mckeever@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33637>	2025-02-27 16:49:11 +00:00
Mel Henning	11b8c8b8e6	nak,nir: Add 64-bit lea_nv Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32517>	2025-02-13 17:36:41 +00:00
Mel Henning	0470643047	nak,nir: Add 32-bit nir_op_lea_nv and use it Changes code size by -0.80% on shaderdb. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32517>	2025-02-13 17:36:41 +00:00
Alyssa Rosenzweig	bd89279dd4	nir: add lower_scratch_to_var pass to ease opencl pain. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>	2024-12-12 21:16:13 +00:00
Karmjit Mahil	b79994e92d	nir,ir3: Add icsel_eqz In IR3 `sel.b32` works based on the 0 so add `icsel_eqz` to fuse the cmp and sel that we'd otherwise need. total Instruction Count in shared programs: 1112814 -> 1110473 (-0.21%) Instruction Count in affected programs: 162701 -> 160360 (-1.44%) helped: 81 HURT: 29 Instruction count are helped. total MOV Count in shared programs: 86777 -> 88671 (2.18%) MOV Count in affected programs: 28119 -> 30013 (6.74%) helped: 1 HURT: 292 Mov count are HURT. total COV Count in shared programs: 15070 -> 14962 (-0.72%) COV Count in affected programs: 5770 -> 5662 (-1.87%) helped: 76 HURT: 2 Cov count are helped. total Last helper instruction in shared programs: 592729 -> 590638 (-0.35%) Last helper instruction in affected programs: 91331 -> 89240 (-2.29%) helped: 30 HURT: 1 Last helper instruction are helped. total Instructions with SS sync bit in shared programs: 29336 -> 29546 (0.72%) Instructions with SS sync bit in affected programs: 4702 -> 4912 (4.47%) helped: 8 HURT: 43 Instructions with ss sync bit are HURT. total Estimated cycles stalled on SS in shared programs: 111590 -> 112401 (0.73%) Estimated cycles stalled on SS in affected programs: 27708 -> 28519 (2.93%) helped: 21 HURT: 61 Estimated cycles stalled on ss are HURT. total cat1 instructions in shared programs: 101933 -> 103695 (1.73%) cat1 instructions in affected programs: 35804 -> 37566 (4.92%) helped: 18 HURT: 290 Cat1 instructions are HURT. total cat2 instructions in shared programs: 380299 -> 377499 (-0.74%) cat2 instructions in affected programs: 128609 -> 125809 (-2.18%) helped: 322 HURT: 0 Cat2 instructions are helped. Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32189>	2024-12-06 08:42:36 +00:00
Job Noorman	22fc90a116	nir: add ir3-specific bitwise triop opcodes ir3 has a number of bitwise triops (e.g., shrm == (src0 >> src1) & src2) that don't have NIR-equivalents. Doing instruction selection for them is a lot more convenient using algebraic patterns than to have to manually match for them. This patch add NIR opcodes for these instructions. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Rob Clark <robclark@freedesktop.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32181>	2024-11-28 06:19:59 +00:00
Rhys Perry	0619e4db63	nir,aco,ac/llvm: add nir_op_alignbyte_amd Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904>	2024-11-13 12:59:26 +00:00
Alyssa Rosenzweig	9ab8d70fa6	nir: add ilea_agx/ulea_agx opcodes to facilitate address mode lowering. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig	85b3dc90e0	nir,agx: lower fmin/fmax in NIR we want to elide flushes, doing so requires more sophisticated analysis than I'd like in the middle of isel. also, it should be done before forming preambles for efficiency (notice the uniform reduction here). let's do it with a NIR pass. total instructions in shared programs: 2768481 -> 2757832 (-0.38%) instructions in affected programs: 644084 -> 633435 (-1.65%) helped: 2242 HURT: 18 helped stats (abs) min: 1 max: 349 x̄: 4.77 x̃: 3 helped stats (rel) min: 0.01% max: 34.91% x̄: 3.19% x̃: 2.19% HURT stats (abs) min: 1 max: 19 x̄: 2.89 x̃: 1 HURT stats (rel) min: 0.24% max: 7.94% x̄: 1.27% x̃: 0.81% 95% mean confidence interval for instructions value: -5.20 -4.22 95% mean confidence interval for instructions %-change: -3.30% -3.01% Instructions are helped. total alu in shared programs: 2182880 -> 2172352 (-0.48%) alu in affected programs: 513166 -> 502638 (-2.05%) helped: 2235 HURT: 16 helped stats (abs) min: 1 max: 349 x̄: 4.73 x̃: 3 helped stats (rel) min: 0.02% max: 37.65% x̄: 3.70% x̃: 2.59% HURT stats (abs) min: 1 max: 19 x̄: 2.50 x̃: 1 HURT stats (rel) min: 0.33% max: 3.74% x̄: 1.04% x̃: 0.91% 95% mean confidence interval for alu value: -5.16 -4.20 95% mean confidence interval for alu %-change: -3.83% -3.49% Alu are helped. total fscib in shared programs: 2178643 -> 2168059 (-0.49%) fscib in affected programs: 514666 -> 504082 (-2.06%) helped: 2243 HURT: 17 helped stats (abs) min: 1 max: 349 x̄: 4.74 x̃: 3 helped stats (rel) min: 0.02% max: 37.65% x̄: 3.74% x̃: 2.59% HURT stats (abs) min: 1 max: 19 x̄: 2.65 x̃: 1 HURT stats (rel) min: 0.33% max: 14.71% x̄: 1.85% x̃: 0.93% 95% mean confidence interval for fscib value: -5.16 -4.20 95% mean confidence interval for fscib %-change: -3.87% -3.53% Fscib are helped. total bytes in shared programs: 18467348 -> 18403042 (-0.35%) bytes in affected programs: 4403648 -> 4339342 (-1.46%) helped: 2247 HURT: 20 helped stats (abs) min: 2 max: 2132 x̄: 28.73 x̃: 18 helped stats (rel) min: 0.01% max: 33.53% x̄: 2.80% x̃: 1.94% HURT stats (abs) min: 4 max: 72 x̄: 12.60 x̃: 6 HURT stats (rel) min: 0.23% max: 6.58% x̄: 1.06% x̃: 0.75% 95% mean confidence interval for bytes value: -31.29 -25.45 95% mean confidence interval for bytes %-change: -2.90% -2.64% Bytes are helped. total regs in shared programs: 864605 -> 864442 (-0.02%) regs in affected programs: 4692 -> 4529 (-3.47%) helped: 68 HURT: 48 helped stats (abs) min: 1 max: 54 x̄: 7.25 x̃: 3 helped stats (rel) min: 4.26% max: 43.20% x̄: 13.21% x̃: 10.53% HURT stats (abs) min: 1 max: 36 x̄: 6.88 x̃: 6 HURT stats (rel) min: 3.64% max: 91.67% x̄: 23.12% x̃: 24.00% 95% mean confidence interval for regs value: -3.60 0.79 95% mean confidence interval for regs %-change: -2.10% 5.75% Inconclusive result (value mean confidence interval includes 0). total uniforms in shared programs: 2120927 -> 2120911 (<.01%) uniforms in affected programs: 770 -> 754 (-2.08%) helped: 6 HURT: 0 helped stats (abs) min: 2 max: 4 x̄: 2.67 x̃: 2 helped stats (rel) min: 1.79% max: 2.70% x̄: 2.13% x̃: 1.96% 95% mean confidence interval for uniforms value: -3.75 -1.58 95% mean confidence interval for uniforms %-change: -2.50% -1.76% Uniforms are helped. total threads in shared programs: 27612224 -> 27613056 (<.01%) threads in affected programs: 7168 -> 8000 (11.61%) helped: 6 HURT: 3 helped stats (abs) min: 64 max: 192 x̄: 170.67 x̃: 192 helped stats (rel) min: 8.33% max: 23.08% x̄: 20.62% x̃: 23.08% HURT stats (abs) min: 64 max: 64 x̄: 64.00 x̃: 64 HURT stats (rel) min: 8.33% max: 9.09% x̄: 8.59% x̃: 8.33% 95% mean confidence interval for threads value: -3.17 188.06 95% mean confidence interval for threads %-change: -0.92% 22.69% Inconclusive result (value mean confidence interval includes 0). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31908>	2024-10-30 10:14:07 -04:00
Rhys Perry	b2abd3bdba	nir: fix shfr constant folding with zero src2 No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Fixes: `08903bbe89` ("nir: add mqsad_4x8, shfr and nir_opt_mqsad") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31808>	2024-10-25 09:59:40 +00:00
Georg Lehmann	dbf63a0788	nir: remove nir_op_is_derivative Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>	2024-10-17 09:50:19 +00:00
Georg Lehmann	f9d2aad7a3	nir: remove alu ddx/ddy Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>	2024-10-17 09:50:19 +00:00
Alyssa Rosenzweig	6287c8251d	nir: add bounds_agx opcode used to facilitate bounds checking optimization Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31532>	2024-10-05 18:30:11 +00:00
Iago Toral Quiroga	aac1c074cc	nir: make fclamp_pos_mali and fsat_signed_mali opcodes generic V3D can use these too. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31480>	2024-10-03 09:02:07 +00:00
Alyssa Rosenzweig	749205fe06	pan/bi: switch to derivative intrinsics rewrote most of the impl but shrug. regresses code gen for mediump but I'm not too bothered given the lackluster perf of fp16 on bifrost :( Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30567>	2024-08-14 01:34:54 +00:00
Faith Ekstrand	bbccbd8d50	nir,nak: Add a nir_op_prmt_nv We have this in hardware since forever and it's really useful. May as well add it to NIR so we can use it in various lowerings. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30218>	2024-07-17 13:38:24 +00:00
Alyssa Rosenzweig	6f48fa4ebe	nir: strengthen fmin/fmax definitions with signed zero SPIR-V strengthened the semantics around signed zero, requiring fmin(-0, +0) = -0. Since nir_op_fmin is commutative, we must also require fmin(+0, -0) = -0 to match, although it's unclear if SPIR-V requires that. We must strengthen NIR's definitions accordingly. This strengthening is additionally motivated by the existing nir_opt_algebraic rule like: (('fmin', a, ('fneg', a)), ('fneg', ('fabs', a))), With the strengthened new definition, this transform is clearly exact. With the weaker definition, the transform could change the sign of zero based on implementation-defined behaviours which ... while, not exactly unsound, is undesireable semantically. ... This is probably technically a bug fix, but I'm not convinced it's worth it's weight in backporting. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30075>	2024-07-15 19:29:00 +00:00
Alyssa Rosenzweig	7fc5a2296b	nir: use MIN2/MAX2 opcodes for imin/umax folding This is more idiomatic and already #include'd. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30075>	2024-07-15 19:29:00 +00:00
Georg Lehmann	99372c1ed7	nir: add ford, funord, fneo, fequ, fltu, fgeu Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29467>	2024-06-27 08:12:29 +00:00
Juan A. Suarez Romero	60e7cb7654	nir: use unsigned types when performing bitshifting Ensure unsigned integers are used instead of signed ones when performing left bit shifts. This has been detected by the Undefined Behaviour Sanitizer (UBSan). Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29772>	2024-06-21 21:07:05 +00:00
Juan A. Suarez Romero	e43cc49806	nir: fix overflow when negating maxint in constant expressions Undefined Behaviour Sanitizer (UBSan) detected the following when running testing `dEQP-VK.graphicsfuzz.cov-fold-negate-min-int-value`: `negation of -2147483648 cannot be represented in type 'int'; cast to an unsigned type to negate this value to itself` SPIR-V spec states that OpSNegate(0x80000000) has to return 0x80000000; in our case, -2147483648 should be -2147483648. While this is not causing any issue because compilers seem to be behaving like that, it is still undefined behaviour, so it expects to be this handled explicitly, which is the purpose of this commit. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29772>	2024-06-21 21:07:05 +00:00
Georg Lehmann	dcab408a6c	nir: remove unpack_half_flush_to_zero It doesn't make sense to have two sets of opcodes for this when all backends that support the flush_to_zero variant just rely on the global floating point mode anyway. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29433>	2024-05-31 09:46:35 +00:00
Connor Abbott	32308fe9f1	ir3/nir: Fix imadsh_mix16 definition The constant-folding definition and comments say that it takes the high 16 bits of the first source and low 16 bits of the second source, but actually it's the opposite. The algebraic optimization, which actually happens and needs to be correct, was correct but the comment above it was wrong. Note that in the way we use it when lowering multiplications, the ordering doesn't matter. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22075>	2024-04-26 12:55:14 +00:00
Rhys Perry	08903bbe89	nir: add mqsad_4x8, shfr and nir_opt_mqsad Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26251>	2024-04-05 11:01:39 +00:00
Faith Ekstrand	f4fb5277c3	nir: Add an imad opcode Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27159>	2024-02-27 21:51:30 -06:00
Rhys Perry	ae54cbeb3f	nir: remove sad_u8x4 All uses of this can be replaced with msad_4x8. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26907>	2024-01-05 18:55:22 +00:00
Rhys Perry	0477421f7d	nir: add msad_4x8 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26907>	2024-01-05 18:55:22 +00:00
Alejandro Piñeiro	c0cfa4f53b	nir: add new opcodes to map new v71 packing/conversion instructions Since v71, broadcom hw include specific packing/conversion instructions, so this commit adds opcodes to be able to make use of them, specially for image stores: * pack_2x16_to_unorm_2x8 (on backend vftounorm8/vftosnorm8): 2x16-bit floating point to 2x8-bit unorm/snorm * f2unorm_16/f2snorm_16 (on backend ftounorm16/ftosnorm16): floating point to 16-bit unorm/snorm * pack_2x16_to_unorm_2x10/pack_2x16_to_unorm_10_2 (on backend vftounorm10lo/vftounorm10hi): used to convert a floating point to a r10g10b10a2 unorm * pack_32_to_r11g11b10 (on backend v11fpack): packs 2 2x16 FP into R11G11B10. * pack_uint_32_to_r10g10b10a2 (on backend v10pack): pack 2 2x16 integer into R10G10B10A2 * pack_4x16_to_4x8 (on backend v8pack): packs 2 2x16 bit integer into 4x8 bits. * pack_2x32_to_2x16 (on backend vpack): 2x32 bit to 2x16 integer pack For the latter, it can be easly confused with the existing pack_32_2x16_split. But note that this one receives two 16bit integer, and packs them on a 32bit integer. But broadcom opcode takes two 32bit integer, takes the lower halfword, and packs them as 2x16 on a 32bit integer. Interestingly broadcom also defines a similar one that packs the higher halfword. Not used yet. Note that at this point we use agnostic names, even if we add a _v3d suffix as they are only available for broadcom, in order to follow current NIR conventions. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25726>	2023-11-20 08:20:31 +00:00
Marek Olšák	f3886e9c02	nir: split FLOAT_CONTROLS_SIGNED_ZERO_INF_NAN_PRESERVE_FP* flags GLSL doesn't preserve NaNs, but it optionally preserves Infs. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25392>	2023-10-17 17:27:12 +00:00
Iván Briano	987749430d	nir: round f2f16{_rtne/_rtz} correctly for constant expressions As noted in the previous commit, the intermediate cast to float from double can produce wrong results. Fixes upcoming Vulkan CTS tests: dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.rounding_rte_sconst_conv_from_fp64_up dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.rounding_rte_sconst_conv_from_fp64_up_nostorage dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp16.input_args.rounding_rte_sconst_conv_from_fp64_up_vert dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp16.input_args.rounding_rte_sconst_conv_from_fp64_up_nostorage_vert dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp16.input_args.rounding_rte_sconst_conv_from_fp64_up_frag dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp16.input_args.rounding_rte_sconst_conv_from_fp64_up_nostorage_frag Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25281>	2023-10-09 23:37:52 +00:00
Alyssa Rosenzweig	b77dc9f7d7	nir: Add NIR_OP_IS_DERIVATIVE property Like IS_SELECTION. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24833>	2023-09-18 08:38:15 -04:00
Marek Olšák	ee225e31c1	nir: fix constant evaluation of fddx/fddy sourcing Inf & NaN constant A derivative of NaN is NaN. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24792>	2023-08-19 14:18:52 -04:00
Alyssa Rosenzweig	d1f6bcd1d0	nir: Add b32fcsel_mdg opcode for Midgard Midgard has both int and float version of b32csel. The backend needs some way to pick between the two, and it's a lot more convenient to choose in NIR before going out-of-SSA than in the backend. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Italo Nicola <italonicola@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23769>	2023-06-30 16:29:35 -04:00
Rhys Perry	25c49e491f	aco,ac/llvm,ac/nir,vtn: unify cube opcodes fossil-db (navi21): Totals from 17068 (12.79% of 133461) affected shaders: Instrs: 24743703 -> 24743572 (-0.00%); split: -0.00%, +0.00% CodeSize: 132579952 -> 132580620 (+0.00%); split: -0.00%, +0.00% VGPRs: 1227840 -> 1227984 (+0.01%) Latency: 403180114 -> 403251188 (+0.02%); split: -0.00%, +0.02% InvThroughput: 75311302 -> 75320892 (+0.01%); split: -0.00%, +0.01% VClause: 415400 -> 415402 (+0.00%); split: -0.00%, +0.00% Copies: 1715404 -> 1715258 (-0.01%); split: -0.01%, +0.01% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> (r600) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23930>	2023-06-30 15:35:03 +00:00
Georg Lehmann	481a34e82e	nir: add single bit test opcodes These directly map to amd's SALU s_bitcmp0/1. For VALU we can use v_cmp_class_f32 if the second source is constant. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23298>	2023-06-29 13:39:30 +00:00
Jesse Natalie	663d957480	nir: Fix constant expression for unpack_64_4x16 Cc: Mesa-stable Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>	2023-06-13 00:43:36 +00:00
Alyssa Rosenzweig	5c1d614256	nir: Add interleave_agx instruction While this is a generic bit twiddling ALU instruction, it's especially useful for address calculations, since the architecture's tiled textures use Morton coding within the tiles. This will be used when lowering image_texel_address on AGX, as part of the image atomics implementation. I don't know if there's any other neat uses I could detect with opt_algebraic, this doesn't seem like an operation a shader would open-code... Maybe useful for BVH building or something... Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23513>	2023-06-12 20:09:53 +00:00
Rhys Perry	1ba73621bc	nir,vtn,aco,ac/llvm: make cube_face_coord_amd more direct Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22636>	2023-05-25 16:29:16 +00:00
Alyssa Rosenzweig	bd466195b9	nir: Make ALU descriptions machine-readable We already document a lot of ALU opcodes, let's make this machine-readable so we can put the descriptions in our generated HTML documentation. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22929>	2023-05-12 12:11:38 +00:00
Alyssa Rosenzweig	6b4f00a3ac	nir: Allow adding descriptions to ALU opcodes This will let us generate nicer documentation. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22929>	2023-05-12 12:11:38 +00:00
Alyssa Rosenzweig	18e19882fa	nir: Model AGX-specific multiply-shift-add Models `(a * b) + (c << d)` in general, as implemented in various forms on AGX. This will be fused with backend NIR opt algebraic rules, both for the literal pattern as well as to strength reduce certain multiplications, e.g. replacing a * 5 with `a + (a << 2)` expressed as imadshl_agx(a, 1, a, 2). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22695>	2023-05-11 09:23:09 -04:00
Erik Faye-Lund	955797d015	nir: fix constant-folding of 64-bit fpow We need to do full pow if 64-bit, and we can do fpow() otherwise. Not the other way around. Fixes: `9076c4e289` ("nir: update opcode definitions for different bit sizes") Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22774>	2023-05-02 07:53:51 +00:00
Qiang Yu	6848e05f9c	nir: pack_(s\|u)norm_2x16 support float16 as input For AMD GPU which has instruction to normalize and pack two float16 inputs, and used when fragment shader export color output. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21552>	2023-03-28 19:57:11 +00:00
Ian Romanick	66840b98e4	nir: ifind_msb_rev can only have int32 sources Just like ifind_msb. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>	2023-03-10 15:27:17 +00:00
Hampus Linander	4ffc7c3ff4	nir: Add extr_agx opcode The AGX extr instruction extracts a bitfield from two 32bit registers. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20628>	2023-02-04 11:13:24 -05:00
Ian Romanick	ea413e826b	nir: Eliminate nir_op_f2b Builds on the work of !15121. This gets to delete even more code because many drivers shared a lot of code for i2b and f2b. No shader-db or fossil-db changes on any Intel platform. v2: Rebase on `1a35acd8d9`. v3: Update a comment in nir_opcodes_c.py. Suggested by Konstantin. v4: Another rebase. Remove f2b stuff from Midgard. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20509>	2023-02-03 22:39:57 +00:00
Timur Kristóf	12652cc549	nir: Add pack_half_2x16_rtz_split opcode. Same as pack_half_2x16_rtz_split, but always uses RTZ mode. Note that pack_half_2x16 rounding mode is unspecified. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15838>	2023-01-26 12:24:24 +00:00
Alyssa Rosenzweig	a49ba0f1ae	nir: Add Midgard-specific fsin/fcos ops NIR has a fsin instruction that takes an argument in radians. Midgard instead has an fsinpi argument that takes an argument in multiples of pi. So, we had a NIR pass that would change fsin(x) to fsin(x / pi) and then map fsin to fsinpi in the backend. But that's invalid! In NIR, the opcode fsin is well-defined. fsin(x) means something very different than fsin(x / pi). They won't usually be equal. The transform fsin(x) -> fsin(x / pi) is fundamentally unsound. It did work before, by accident. Most NIR passes don't care about the semantics of ALU instructions. fsin(x) and fsin(x / pi) are both well-defined but fundamentally different NIR shaders. So while rewriting is wrong -- the NIR we get out is not equivalent to the NIR we put in, and the Midgard ops we generate are not equivalent to the NIR -- but if we don't run any passes that care about the definition of fsin the two wrongs will cancel out to make a right. However, some NIR passes do care about the definitions of ALU instructions, instead of treating them as named black boxes. In particular, constant folding (nir_opt_constant_fold) evaluates ALU instructions when their inputs are constants, according to the definition in nir_opcodes.py. So our little charade will only work if we don't call nir_opt_constant_fold, or if all the fsin instructions have non-constant inputs. At the beginning of this series, that is the case. With the later scalarization change, that's no longer the case, and the unsoundness translates to real failing tests rather than a quibble of NIR's semantics. To mitigate, we define a new NIR opcode with the semantics we want and translate fsin(x) = fsin_mdg(x / pi), where that equivalence does hold mathematically. So the new translation is sound and doesn't rely on lucky pass ordering. This matches the approach already used for AMD and AGX, which have fsin_amd and fsin_agx opcodes respectively. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Italo Nicola <italonicola@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19350>	2023-01-16 22:20:43 +00:00

1 2 3 4

200 commits