fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-21 20:10:14 +01:00

Author	SHA1	Message	Date
Ian Romanick	66840b98e4	nir: ifind_msb_rev can only have int32 sources Just like ifind_msb. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>	2023-03-10 15:27:17 +00:00
Hampus Linander	4ffc7c3ff4	nir: Add extr_agx opcode The AGX extr instruction extracts a bitfield from two 32bit registers. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20628>	2023-02-04 11:13:24 -05:00
Ian Romanick	ea413e826b	nir: Eliminate nir_op_f2b Builds on the work of !15121. This gets to delete even more code because many drivers shared a lot of code for i2b and f2b. No shader-db or fossil-db changes on any Intel platform. v2: Rebase on `1a35acd8d9`. v3: Update a comment in nir_opcodes_c.py. Suggested by Konstantin. v4: Another rebase. Remove f2b stuff from Midgard. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20509>	2023-02-03 22:39:57 +00:00
Timur Kristóf	12652cc549	nir: Add pack_half_2x16_rtz_split opcode. Same as pack_half_2x16_split, but always uses RTZ mode. Note that pack_half_2x16 rounding mode is unspecified. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15838>	2023-01-26 12:24:24 +00:00
Alyssa Rosenzweig	a49ba0f1ae	nir: Add Midgard-specific fsin/fcos ops NIR has a fsin instruction that takes an argument in radians. Midgard instead has an fsinpi instruction that takes an argument in multiples of pi. So, we had a NIR pass that would change fsin(x) to fsin(x / pi) and then map fsin to fsinpi in the backend. But that's invalid! In NIR, the opcode fsin is well-defined. fsin(x) means something very different than fsin(x / pi). They won't usually be equal. The transform fsin(x) -> fsin(x / pi) is fundamentally unsound. It did work before, by accident. Most NIR passes don't care about the semantics of ALU instructions. fsin(x) and fsin(x / pi) are both well-defined but fundamentally different NIR shaders. So the rewriting is wrong -- the NIR we get out is not equivalent to the NIR we put in, and the Midgard ops we generate are not equivalent to the NIR -- but if we don't run any passes that care about the definition of fsin, the two wrongs will cancel out to make a right. However, some NIR passes do care about the definitions of ALU instructions, instead of treating them as named black boxes. In particular, constant folding (nir_opt_constant_fold) evaluates ALU instructions when their inputs are constants, according to the definition in nir_opcodes.py. So our little charade will only work if we don't call nir_opt_constant_fold, or if all the fsin instructions have non-constant inputs. At the beginning of this series, that is the case. With the later scalarization change, that's no longer the case, and the unsoundness translates to real failing tests rather than a quibble of NIR's semantics. To mitigate, we define a new NIR opcode with the semantics we want and translate fsin(x) = fsin_mdg(x / pi), where that equivalence does hold mathematically. So the new translation is sound and doesn't rely on lucky pass ordering. This matches the approach already used for AMD and AGX, which have fsin_amd and fsin_agx opcodes respectively. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Italo Nicola <italonicola@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19350>	2023-01-16 22:20:43 +00:00
Ian Romanick	eb76cee9f8	nir: Eliminate nir_op_i2b There are a lot of optimizations in opt_algebraic that match ('ine', a, 0), but there are almost none that match i2b. Instead of adding a huge pile of additional patterns (including variations that include both ine and i2b), always lower i2b to a != 0. At this point in the series, it should be impossible for anything to generate i2b, so there /should not/ be any changes. The failing test on d3d12 is a pre-existing bug that is triggered by this change. I talked to Jesse about it, and, after some analysis, he suggested just adding it to the list of known failures. v2: Don't rematerialize i2b instructions in dxil_nir_lower_x2b. v3: Don't rematerialize i2b instructions in zink_nir_algebraic.py. v4: Fix zink-on-TGL CI failures by calling nir_opt_algebraic after nir_lower_doubles makes progress. The latter can generate b2i instructions, but nir_lower_int64 can't handle them (anymore). v5: Add back most of the hunk at line 2125 of nir_opt_algebraic.py. I had accidentally removed the f2b(b2f(x)) optimization. v6: Just eliminate the i2b instruction. v7: Remove missed i2b32 in midgard_compile.c. Remove (now unused) emit_alu_i2orf2_b1 function from sfn_instr_alu.cpp. Previously this function was still used. 🤷 No shader-db changes on any Intel platform. All Intel platforms had similar results. (Ice Lake shown) Instructions in all programs: 141165875 -> 141165873 (-0.0%) Instructions helped: 2 Cycles in all programs: 9098956382 -> 9098956350 (-0.0%) Cycles helped: 2 The two Vulkan shaders are helped because of the "new" (('b2i32', ('ine', ('ubfe', a, b, 1), 0)), ('ubfe', a, b, 1)) algebraic pattern. Acked-by: Jesse Natalie <jenatali@microsoft.com> [earlier version] Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Daniel Schürmann <daniel@schuermann.dev> [earlier version] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>	2022-12-14 06:23:21 +00:00
Georg Lehmann	741dbadae0	nir: Fix ifind_msb_rev constant folding. For example if src0 is 0x80000000 we should return 1, not 0. Signed-off-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `a5747f8ab3` ("nir: add opcodes for *find_msb_rev and lowering") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18951>	2022-10-22 11:57:55 +02:00
Rhys Perry	bb0415b697	nir: allow 16-bit fsin_amd/fcos_amd Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10587>	2022-07-07 22:18:08 +00:00
Rhys Perry	69d21a3dee	nir: rename fsin_r600/fcos_r600 to fsin_amd/fcos_amd GCN has better range, but constant folding is the same. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10587>	2022-07-07 22:18:08 +00:00
Iago Toral Quiroga	84a0dca9df	nir: fix documentation for uadd_carry and usub_borrow opcodes These opcodes were fixed to return an integer instead of a boolean value some time ago, but the documentation for them was not updated and still talked about a boolean result. Fixes: `b0d4ee520` ('nir/opcodes: Fix up uadd_carry and usub_borrow') Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17372>	2022-07-07 09:16:24 +00:00
Ian Romanick	fd1f2d3b5a	nir: Add and use algebraic property "is selection" There are several places that should have supported the various sized versions of bcsel and the various nir_op_[fi]csel_* opcodes. Rather than enumerate the whole list, add a property. v2: Make the comment for NIR_OP_IS_SELECTION more descriptive. Suggested by Jason. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17048>	2022-06-22 19:26:59 +00:00
Ian Romanick	ccd18ec4f3	nir: i32csel opcodes should compare with integer zero Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Noticed-by: Georg Lehmann <dadschoorse@gmail.com> Fixes: `0f5b3c37c5` ("nir: Add opcodes for fused comp + csel and optimizations") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17048>	2022-06-22 19:26:59 +00:00
Jason Ekstrand	4b67d70d22	nir: Fix constant folding for non-32-bit ifind_msb and clz Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16348>	2022-05-10 03:37:44 +00:00
Jason Ekstrand	5c9e4d400a	nir/opcodes: fisfinite32 should return bool32 Otherwise constant-folding will fold it to 0/1 instead of 0/~0. Fixes: `330e28155f` ("nir: add 32-bit bool of fisfinite") Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15984>	2022-04-16 02:46:12 +00:00
Samuel Pitoiset	6532307555	nir: introduce nir_pack_{sint,uint}_2x16 instructions These instructions have AMD hardware equivalent and they will be used to lower fragment shader outputs in NIR. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15231>	2022-03-04 08:06:56 +00:00
Emma Anholt	b1f349dff4	nir: Allow the _replicates opcodes to have num_components != 4. This required relaxing a core NIR assertion which I don't think is doing any important validation. The shader-db effects here are small, but they're important for avoiding a regression when we start doing per-component DCE in opt_shrink_vectors (https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12468) softpipe shader-db: total instructions in shared programs: 2859777 -> 2859454 (-0.01%) instructions in affected programs: 18881 -> 18558 (-1.71%) total temps in shared programs: 293994 -> 293914 (-0.03%) temps in affected programs: 418 -> 338 (-19.14%) i915g: total instructions in shared programs: 407562 -> 407544 (<.01%) instructions in affected programs: 570 -> 552 (-3.16%) r300: total instructions in shared programs: 1414450 -> 1414459 (<.01%) instructions in affected programs: 44494 -> 44503 (0.02%) total vinst in shared programs: 473782 -> 473727 (-0.01%) vinst in affected programs: 1102 -> 1047 (-4.99%) total sinst in shared programs: 231224 -> 231216 (<.01%) sinst in affected programs: 432 -> 424 (-1.85%) total temps in shared programs: 197605 -> 197607 (<.01%) temps in affected programs: 103 -> 105 (1.94%) crocus hsw: total instructions in shared programs: 8158185 -> 8158134 (<.01%) instructions in affected programs: 10927 -> 10876 (-0.47%) Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15178>	2022-02-25 12:31:48 -08:00
Ian Romanick	38800b385c	nir: All set-on-comparison opcodes can take all float types Extend `4195a9450b` so that the next poor fool doesn't come along and say, "sge does the right thing for 16-bit sources, but slt gives a NIR validation failure. What the deuce?" NOTE: This commit is necessary to prevent regressions in GLSLstd450Step tests of 16-bit sources at "spirv: Produce correct result for GLSLstd450Step with NaN". Fixes: `4195a9450b` ("nir: sge operation is defined for floating-point types") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13999>	2022-02-10 18:15:39 +00:00
Rhys Perry	7f05ea3793	nir: add nir_op_fmulz and nir_op_ffmaz Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13436>	2022-01-20 22:54:42 +00:00
Samuel Pitoiset	011ea32585	nir: fix constant expression of ibitfield_extract This fixes dEQP-VK.graphicsfuzz.cov-condition-bitfield-extract-integer. For example, nir_ibitfield_extract(3, 1, 2) should return 1. Cc: 21.3 mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13791>	2021-11-16 17:32:21 +00:00
Alyssa Rosenzweig	3e8f540753	nir: Add Mali-specific derivative opcodes Add derivative opcodes fddx_must_abs_mali/fddy_must_abs_mali satisfying: fabs(fdd_must_abs_mali(v)) = fabs(fdd(v)) The sign of their result is undefined. On Bifrost and Valhall, these unsigned derivatives can be implemented more efficiently than the correctly-signed counterparts, since the sign fixup requires extra ALU instructions. On backends where this is the case, it is useful to optimize fabs(fdd(v)) to fabs(fdd_must_abs_mali(v)). This pattern comes up with the GLSL builtin `fwidth`. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12332>	2021-10-06 00:40:57 +00:00
Rhys Perry	41ecef7855	nir: add sdot_2x16 and udot_2x16 opcodes Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>	2021-09-03 13:21:27 +00:00
Timur Kristóf	33630090a2	nir: Add comment to explain the sad_u8x4 opcode. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12649>	2021-09-01 08:42:03 +00:00
Filip Gawin	46f3582c6f	nir: fix ifind_msb_rev by using appropriate type As you can see, the comparison "x < 0" doesn't make sense if x is unsigned. Fixes: `a5747f8a` ("nir: add opcodes for *find_msb_rev and lowering ") Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12548>	2021-08-26 18:35:31 +00:00
Ian Romanick	6c18a3b497	nir/opcodes: Add integer dot-product opcodes Six opcodes are added: sdot_4x8_iadd, udot_4x8_uadd, sudot_4x8_iadd, sdot_4x8_iadd_sat, udot_4x8_uadd_sat, and sudot_4x8_iadd_sat. These represent the combinations of integer dot-product and add that operate on packed source vectors. That is, the four 8-bit values for each vector are stored in a single 32-bit integer. Some hardware may prefer to operate on unpacked byte vectors. When such hardware comes to Mesa, we'll have to figure out how to name things. v2: Add nir_op_iudp4a and nir_op_iudp4a_sat instructions. These opcodes are not 2-source commutative. v3: Rename all opcodes to be more like some existing 4x8 opcodes. Suggested by Timur. Change type of packed vector sources to uint32, change types of constant folding variables to have explicit size, and delete some extra casts. All suggested by Jason. v4: Fix typo previously noticed by Alyssa but missed in v2. v5: Add has_sudot_4x8 flag. Requested by Rhys. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>	2021-08-24 19:58:57 +00:00
Ian Romanick	f0a8a9816a	nir: intel/compiler: Add and use nir_op_pack_32_4x8_split A lot of CTS tests write a u8vec4 or an i8vec4 to an SSBO. This results in a lot of shifts and MOVs. When that pattern can be recognized, the individual 8-bit components can be packed much more efficiently. v2: Rebase on `b4369de27f` ("nir/lower_packing: use shader_instructions_pass") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>	2021-08-18 22:03:37 +00:00
Vinson Lee	8d679f4f4e	nir: Initialize evaluate_cube_face_index_amd dst.x. Fix defect reported by Coverity Scan. Uninitialized scalar variable (UNINIT) uninit_use: Using uninitialized value dst.x. Fixes: `a1a2a8dfda` ("nir: add AMD_gcn_shader extended instructions") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12290>	2021-08-12 23:13:52 -07:00
Ian Romanick	3ba66ebbc8	nir/opcodes: Use u_intN_(min|max) uadd_sat was updated using sed, so I didn't even notice the surrounding opcodes. Oops. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12297>	2021-08-10 22:16:13 +00:00
Rhys Perry	e008eb1224	nir: fix signed overflow for iadd constant folding Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12039>	2021-08-09 11:00:39 +00:00
Dave Airlie	330e28155f	nir: add 32-bit bool of fisfinite Add the bool lowering as well. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12207>	2021-08-06 12:06:21 +10:00
Ian Romanick	72259a870f	util: Add and use functions to calculate min and max int for a size Many places need to know the maximum or minimum possible value for a given size integer... so everyone just open-codes their favorite version. There is some potential to hit either undefined or implementation-defined behavior, so having one version that Just Works seems beneficial. v2: Fix copy-and-pasted bug (INT64_MAX instead of INT64_MIN) in u_intmin. Noticed by CI. Lol. Rename functions `s/u_(uint\|int)(min\|max)/u_\1N_\2/g`. Suggested by Jason. Add some unit tests that would have caught the copy-and-paste bug before wasting CI time. Change the implementation of u_intN_min to use the same pattern as stdint.h. This avoids the integer division. Noticed by Jason. v3: Add changes to convert_clear_color (src/gallium/drivers/iris/iris_clear.c). Suggested by Nanley. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12177>	2021-08-03 12:55:02 -07:00
Sagar Ghuge	e8dff256c0	nir: Add new opcode for ternary addition v2: - Make it 2src commutative (Connor Abbott) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11596>	2021-07-16 15:59:55 +00:00
Thomas H.P. Andersen	ffea622604	nir/ifind_msb_rev: fix input check ifind_msb_rev was introduced in `a5747f8ab3`. ifind_msb_rev guards against src0 being both 0 or -1 at the same time. That is always true. This patch changes it to check for those values individually. Spotted from a compile warning. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Fixes: `a5747f8ab3` ("nir: add opcodes for *find_msb_rev and lowering") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11630>	2021-07-04 12:17:58 +00:00
Alyssa Rosenzweig	3da23a9c7e	nir: Fix constant folding for irhadd/urhadd This should be a subtract, not an add. The comment's proof is correct, but the (wrong) expression we actually use isn't what's in the comment! Correct the discrepancy. The lowering in nir_opt_algebraic was correctly typed. Fixes: `272e927d0e` ("nir/spirv: initial handling of OpenCL.std extension opcodes") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11671>	2021-07-02 00:21:22 +00:00
Jason Ekstrand	f00b5a30f5	nir: Require vectorized ALU ops to be all-or-nothing Long ago, the semantics of bcsel were such that it took a single boolean value and selected between whole vectors. These days, it takes a vector boolean with the assumption that if you want the old behavior you can just use a .xxxx swizzle. There currently are no opcodes which use an output_size of 0 but have a scalar or fixed-vector input. Let's disallow it for now to force us to think through the semantics again if this ever comes up as something someone actually wants. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11438>	2021-06-21 16:46:59 +00:00
Jason Ekstrand	2e08bae9b3	nir,vc4: Suffix a bunch of unorm 4x8 opcodes _vc4 Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11463>	2021-06-21 09:04:08 -05:00
Jason Ekstrand	0afbfee8da	nir,panfrost: Suffix fsat_signed and fclamp_pos with _mali Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11463>	2021-06-21 09:03:34 -05:00
Jason Ekstrand	f0f713960b	nir,amd: Suffix nir_op_cube_face_coord/index with _amd Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11463>	2021-06-21 09:03:34 -05:00
Timur Kristóf	c92dab8e2b	nir: Add nir_op_sad_u8x4 which corresponds to AMD's v_sad_u8. NIR currently doesn't have any intrinsics for a horizontal packed add, so this one is modeled after AMD's v_sad_u8. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>	2021-06-09 16:48:51 +00:00
Rhys Perry	1cbcfb8b38	nir, nir/algebraic: add byte/word insertion instructions Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>	2021-06-08 08:57:42 +00:00
Jesse Natalie	d7ca0319d7	nir: Add relaxed 24bit opcodes These are equivalent to the 32bit opcodes if there are no more efficient 24bit opcodes available, but inputs are guaranteed to already be 24bit, so the 24bit opcodes can be used instead if they exist and are efficient. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10549>	2021-05-05 22:06:42 +00:00
Alyssa Rosenzweig	a976101da5	nir/opcodes: Reword confusing comment Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10578>	2021-05-03 12:51:47 +00:00
Alyssa Rosenzweig	0ea67e57e5	nir: Add fsin_agx opcode Used to split up the fsin/fcos lowering for AGX between NIR and the backend, to permit algebraic optimizations without polluting NIR with too many hardware details. The backend NIR lowering produces an fmul/ffma of the input so we can optimize code like sin(2*x). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10582>	2021-05-02 17:41:09 -04:00
Jesse Natalie	3c8bcdc863	nir: Add a new opcode for [un]packing doubles HLSL doesn't support bitcasting a 64bit integer to a double. DXIL doesn't have generic pack/unpack instructions, so we lower those to integer bitwise ops. As a result, NIR generic double pack/unpack would require our backend to emit a bitcast to get a double, but we want to match HLSL semantics and emit MakeDouble/SplitDouble. Adding a dedicated opcode for double pack/unpack allows us to add a pass to emit that instead, which lets our backend emit the right instruction to pack and unpack doubles. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10063>	2021-04-09 01:54:33 +00:00
Gert Wollny	318701b803	nir: Add r600 specific sin and cos variants r600 expects the input values to be normalized by dividing by 2 * PI, so add an opcode to be able to lower this in nir. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9452>	2021-03-22 15:19:46 +01:00
Gert Wollny	0f5b3c37c5	nir: Add opcodes for fused comp + csel and optimizations Some backends, like r600, support a fused version of int and float compare against zero and csel. Adding these opcodes here makes it possible to optimize this in nir. v2: Add rules for float compare + csel Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9452>	2021-03-22 15:19:46 +01:00
Gert Wollny	a5747f8ab3	nir: add opcodes for find_msb_rev and lowering Some hardware supports a version of find_msb where the bits are counted starting at the high bit, and this needs some lowering to obtain the value that is expected by find_msb Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9452>	2021-03-22 15:19:46 +01:00
Gert Wollny	e5db9c3dd4	nir: Add r600 specific CUBE opcode to evaluate cube texture coords and face The opcode evaluates the unnormalized coordinates, the length of the major axis, and the cube face. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9200>	2021-02-26 09:51:37 +01:00
Rhys Perry	95819663b7	nir: allow 5 component vectors These will be useful for sparse texture instructions and image load intrinsics. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>	2021-01-06 20:36:38 +00:00
Ian Romanick	71961c73a9	nir: Correctly constant fold fsign(NaN) and fsign(-0) GLSL and SPIR-V GLSL.std.450 don't have any requirements for fsign(NaN), and both only require that FSign(-0.0) == 0.0. OpenCL, on the other hand, requires sign(-0.0) be exactly -0.0. It also requires that sign(NaN) be exactly 0.0. In practice, this change is difficult to test. Our GLSL frontend already constant folds sign(NaN) to 0.0 before even getting to NIR. As far as I can tell, glslang does the same. I don't have a good way to run an OpenCL SPIR-V test. Maybe SPIR-V GLSL.std.450 assembly? No shader-db or fossil-db changes on any Intel platform. Acked-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	363efc2823	nir: Make some notes about fsign versus NaN This commit only documents the current behavior, even if that behavior is not the behavior preferred by the relevant specs. In SPIR-V, there are two flavors of the sign instruction, and each lives in an extended instruction set. The GLSL.std.450 FSign instruction is defined as: Result is 1.0 if x > 0, 0.0 if x = 0, or -1.0 if x < 0. This also matches the GLSL 4.60 definition. However, the OpenCL.ExtendedInstructionSet.100 sign instruction is defined as: Returns 1.0 if x > 0, -0.0 if x = -0.0, +0.0 if x = +0.0, or -1.0 if x < 0. Returns 0.0 if x is a NaN. There are two differences. Each treats -0.0 differently, and each also treats NaN differently. Specifically, GLSL.std.450 FSign does not define any specific behavior for NaN. There has been some discussion in Khronos about the NaN behavior of GLSL.std.450 FSign. As part of that discussion, I did some research into how we treat NaN for nir_op_fsign, and this commit just captures some of those notes. v2: Document the expected behavior of nir_op_fsign more thoroughly. Suggested by Rhys. Note that the current implementation of constant folding does not produce the expected result for NaN. Suggested by Caio. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> [v1] Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
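
The two fsign commits above (`71961c73a9` and `363efc2823`) quote two spec definitions that differ on -0.0 and NaN. The Python sketch below only restates those quoted definitions so the difference can be checked by hand; it is not Mesa's constant-folding code, and the function names `fsign_glsl` and `fsign_opencl` are hypothetical. Since GLSL.std.450 FSign leaves NaN unspecified, returning 0.0 for NaN in `fsign_glsl` is an assumption of this sketch.

```python
import math


def fsign_glsl(x: float) -> float:
    """GLSL.std.450 FSign as quoted: 1.0 if x > 0, 0.0 if x = 0, -1.0 if x < 0.

    NaN is unspecified by that definition; this sketch returns 0.0 for it
    (an assumption, not a spec requirement).
    """
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    # Reached for +0.0, -0.0 and NaN (all comparisons with NaN are false).
    return 0.0


def fsign_opencl(x: float) -> float:
    """OpenCL sign as quoted: preserves the sign of zero, returns 0.0 for NaN."""
    if math.isnan(x):
        return 0.0
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    # x is +0.0 or -0.0 here; return it unchanged to keep its sign.
    return x


if __name__ == "__main__":
    for value in (2.5, -2.5, 0.0, -0.0, math.nan):
        print(value, fsign_glsl(value), fsign_opencl(value))
```

Both functions agree for non-zero finite inputs. They differ only in the sign of the result for -0.0, which compares equal to 0.0 and so still satisfies the GLSL requirement FSign(-0.0) == 0.0.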


155 commits