Commit graph

1027 commits

Author SHA1 Message Date
Faith Ekstrand
116a851264 nir: Add mode filtering to lower_mem_access_bit_sizes
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21524>
2023-03-03 02:00:39 +00:00
Georg Lehmann
a00b50d820 nir: change 16bit image dest folding option to per type
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21404>
2023-02-27 09:55:34 +00:00
Alyssa Rosenzweig
8058d31a25 nir: Add nir_texop_lod_bias_agx
Add a new texture opcode that returns the LOD bias of the sampler. This will be
used on AGX to lower sampler LOD bias to txb and friends. This needs to be a
texture op (and not a new intrinsic) to handle both bindless and bindful
samplers across GL and Vulkan in a uniform way.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21276>
2023-02-27 02:35:41 +00:00
Faith Ekstrand
e41753cf17 nir/lower_io: Handle buffer_array_length for more address modes
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21446>
2023-02-24 20:37:10 +00:00
Caio Oliveira
3328714295 nir/lower_subgroups: Add option lower_rotate_to_shuffle
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19797>
2023-02-24 06:33:51 +00:00
Daniel Schürmann
c20751d61d nir: add lowering for Loop Continue Constructs
This pass lowers Loop Continue Constructs to the previous solution
by inserting it at the beginning of the loop:

loop {
   if (i != 0) {
      continue construct
   }
   loop body
}

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13962>
2023-02-21 10:41:11 +00:00
Daniel Schürmann
d4b97bf3fa nir: add Continue Construct to nir_loop
The added continue_list corresponds to the SPIR-V
Continue Construct and serves as a converged control-flow
construct and is executed after each continue statement
and before the next iteration of the loop body.

Also adds validation rules for loops with Continue Construct

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13962>
2023-02-21 10:41:11 +00:00
Faith Ekstrand
2e2d7803c7 nir: Add a load/store bit size lowering pass
This is based on brw_nir_lower_mem_access_bit_sizes() but ended up being
substantially different.  While the core concepts are all the same, the
brw_* version made a lot of Intel-specific assumptions.  The new version
takes a callback which takes a number of bytes of data and an alignment
pair and returns a bit size and number of components to load/store.

Reviewed-by: M Henning <drawoc@darkrefraction.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21232>
2023-02-17 00:55:54 +00:00
Jesse Natalie
25ee07373c nir_lower_fp16_casts: Allow opting out of lowering certain rounding modes
Reviewed-by: Giancarlo Devich <gdevich@microsoft.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21029>
2023-02-11 06:12:23 +00:00
Ian Romanick
682e83f012 nir/inline_uniforms: Make add_inlinable_uniforms public
This is step 5 in an attempt to unify a bunch of nir_inline_uniforms.c
and lvp_inline_uniforms.c code.

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21179>
2023-02-10 03:18:23 +00:00
Ian Romanick
cdd23b1efa nir/inline_uniforms: Make src_only_uses_uniforms public, change name
While making the function public, rename it to
nir_collect_src_uniforms. The old name makes it sound like it's just a
query that doesn't have side effects. That is, however, not the case.

This is step 4 in an attempt to unify a bunch of nir_inline_uniforms.c
and lvp_inline_uniforms.c code.

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21179>
2023-02-10 03:18:23 +00:00
Jason Ekstrand
9c62e0c77d nir: Remove nir_lower_io_force_sample_interpolation
It's no longer used.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>
2023-02-06 09:12:17 +00:00
Alyssa Rosenzweig
071ac59960 nir: Add a late texcoord replacement pass
Add a second NIR pass for lowering point/texture coordinate replacement (i.e.
point sprites). Why a second one? The current pass works on derefs/variables,
which is good for drivers that don't lower I/O at all (like Zink, where the pass
originates). However, it is problematic for hardware drivers: the inputs to this
pass depend on the shader key, so we want to run the pass as late as possible to
minimize the cost of building/compiling the associated shader variants. In
particular, we need to be able to lower point sprites after lowering I/O if we
would like to lower I/O when preprocessing NIR.

The logic for early lowering and late lowering is considerably different (the
late lowering is a lot simpler), so I've split this out into a second pass
rather than trying to weld them together into one.

This pass will be used on Asahi, which currently uses the early pass. It may be
useful for other drivers as well. (Actually, it's been shipping on Asahi for a
little while now, just hasn't been sent upstream yet.)

Tested with Neverball.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Asahi Lina <lina@asahilina.net>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21065>
2023-02-03 15:03:06 +00:00
Amber
c384690ab7 nir: support lowering nir_intrinsic_image_samples to a constant load
This can be used by multiple drivers that do not support ms images

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Reviewer-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Signed-off-by: Amber Amber <amber@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20813>
2023-02-01 19:52:49 +00:00
Marcin Ślusarz
2255375c4d nir: add nir_mod_analysis & its tests
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20050>
2023-01-31 13:50:08 +00:00
Timur Kristóf
12652cc549 nir: Add pack_half_2x16_rtz_split opcode.
Same as pack_half_2x16_rtz_split, but always uses RTZ mode.
Note that pack_half_2x16 rounding mode is unspecified.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15838>
2023-01-26 12:24:24 +00:00
Gert Wollny
2e05cfa179 nir: Add range_base to atomic_counter and an option to use it
Some drivers may encode constant offsets in the instruction, so
make it possible for the drivers to request lowering the atomic
uniform offset into the range_base variable of the intrinsic.

v2: drop patch to use build-in array offset evaluation, it makes
    problems with zink, and update the code accordingly
v3: always initialize range base

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19980>
2023-01-17 13:19:04 +00:00
Gert Wollny
c4cde91c1b nir: Add possibility to store image var offset in range_base
Add the intrinsic range_base value to the image intrinsics and add
the option to store the image array offset into range_base instead
of adding it to the image array index if the driver requests it.

v2: Always initialize range_base

v3: fix for bindless intrinsics

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19980>
2023-01-17 13:19:04 +00:00
Pavel Ondračka
3fcdd9e4a7 nir/lower_bool: ntt: Generate a good opcode for bcsel
This is heavily copy-pasted from a patch of Ian Romanick, including the
commit message.

Previously, this pass always generated fcsel for bcsel.  This was the
only place that generate fcsel, so various drivers assumed (and needed!)
that src0 was a Boolean with 0.0 or 1.0 as the only values.

Specifically, many DX9 / GL_ARB_vertex_program platforms lack a CMP
instruction in vertex shaders.  In those cases, they would use LRP to
implement fcsel.  The bummer is that many plaforms have a real fcsel
instruction, and those platforms would benefit from other places
generating that opcode.

Instead of leaving assumptions in drivers about the sources of an opcode
that they can't really support, allow them to control the way the
lowering pass translates bcsel.  Two flags are used to control this:

- If the driver sets has_fused_comp_and_csel in nir_options, fcsel_gt
  will be used.  Since the Boolean value is 0.0 or 1.0, this is
  equivalent to fcsel.

- If the parameter has_fcsel_ne is set, fcsel will be used.  This is the
  old path.

- Otherwise, the lowering pass assumes we're on a crufty, old DX9 vertex
  program, and it emits flrp.

With this, the assumptions about src0 of fcsel in NTT can be removed.
If a platform can't handle fcsel, it should ensure that the lowering
pass won't generate it.

No change in shader-db.

Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20162>
2023-01-12 23:01:05 +00:00
Danylo Piliaiev
1c9ee30838 nir/fold_16bit_tex_image: Add type granularity for dst folding
Some HW may be able to fold only some of dst types, e.g.
for Adreno folding i32 -> i16 could cause a different result since
folded variant clamps the result instead of masking it.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20396>
2022-12-23 15:48:18 +01:00
Lionel Landwerlin
3af08b9c30 nir/divergence: handle shader_record_ptr intrinsic
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes 6b8fd65e84 ("spirv: Implement the new ray-tracing storage classes")

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20413>
2022-12-23 09:22:13 +00:00
Danylo Piliaiev
8482ad0110 nir/nir_lower_is_helper_invocation: Lower helper invocation if required
nir_lower_is_helper_invocation lowers intrinsic_is_helper_invocation
and uses load_helper_invocation (which is lowered by nir_lower_system_values).
While nir_lower_system_values may lower SYSTEM_VALUE_HELPER_INVOCATION
into intrinsic_is_helper_invocation.

So they depend on each other. Break the dependency by making
nir_lower_is_helper_invocation aware of lower_helper_invocation option
and emitting lowered load_helper_invocation when required.

Happens with SPIR-V 1.6 for which gl_HelperInvocation is translated into
"BuiltIn HelperInvocation" + "Volatile", which nir_lower_system_values
translates into is_helper_invocation.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19677>
2022-12-20 11:06:52 +00:00
Qiang Yu
194add2c23 nir: lower image add lower_to_fragment_mask_load_amd option
Like lower_to_fragment_fetch_amd option in lower tex,
this is for radeonsi to lower MS image ops.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18666>
2022-12-19 09:22:16 +08:00
Qiang Yu
1461b5f61b nir: add image fragment mask load intrinsic
Like nir_texop_fragment_mask_fetch_amd, this is used to load multi
sample image fmask data for AMD GPU.

We will lower multi sample image load and samples_identical intrinsics
to use it latter for radeonsi. RADV does not need this because it
always expand fmask images before dispatch compute shader.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18666>
2022-12-19 09:22:11 +08:00
Marek Olšák
c0d69b40bc nir: add nir_texop_sampler_descriptor_amd
We'll use it to query the min/mag filter in the shader.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19422>
2022-12-13 20:33:05 +00:00
Rhys Perry
e1f5100311 nir: add task_payload and shader_out to nir_var_vec_indexable_modes
Since these can be cross-invocation, we need this to write individual
components without race conditions or loads.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7391
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19597>
2022-12-09 20:56:52 +00:00
Jason Ekstrand
9d43aebcad nir: Use nir_component_mask_t for nir_alu_dst::write_mask
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20193>
2022-12-06 18:37:19 -06:00
Marcin Ślusarz
f6adfd6278 nir/lower_task_shader: allow offsetting of the start of payload
We need this, because on Intel task payload starts with private header,
followed by user-accessible data.

Fixes: 37e78803d7 ("intel/compiler: use nir_lower_task_shader pass")

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19409>
2022-12-01 11:19:47 +00:00
Erik Faye-Lund
d0342e28b3 nir: Add helper to create passthrough GS shader
Based on nir_create_passthrough_tcs and d3d12_make_passthrough_gs, this
creates a passthrough geometry shader that can be used by drivers that
needs to emulate some graphics features in the geometry shader.

Reviewed-by: Rob Clark <robclark@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19987>
2022-11-30 08:08:25 +00:00
Lionel Landwerlin
9d0560fe87 nir/lower_shader_calls: enable vectorizer
We cannot fully use the vectorizer outside of this pass because once
stack load/store operations have been lower to global load/store, the
robustness rule applies to those as they would to application
load/store.

But this is all internal and we know it doesn't require out of bound
checking. So doing the vectorizing here is the best solution. We just
have to teach the vectorizer about our intrinsics.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20058>
2022-11-30 07:23:30 +00:00
Ian Romanick
2ba55ec504 nir/range_analysis: Set higher default maximum for max_workgroup_count
Fixes: c2a81ebe19 ("nir: Add default unsigned upper bound configuration.")
Closes: #7676
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19835>
2022-11-19 05:40:42 +00:00
Konstantin Seurer
f5b6576585 nir: Add a pass for combining ray queries
We can determice scopes/ranges of the use of ray queries and use this information to combine ray queries.

Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16593>
2022-11-11 15:17:08 +00:00
Konstantin Seurer
d22037b96c nir: Add and use nir_intrinsic_is_ray_query helper
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16593>
2022-11-11 15:17:08 +00:00
Lionel Landwerlin
b499a27d74 nir: make ray query load values visible in NIR prints
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19641>
2022-11-10 14:40:08 +02:00
Karol Herbst
d459a58473 nir/lower_cl_images: support keeping derefs
This is needed by radeonsi and zink

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19381>
2022-11-10 10:21:34 +00:00
Caio Oliveira
8ab628ab2e nir: Don't reorder volatile intrinsics
Fixes issue with "is helper invocation" that in recent SPIR-V is mapped to
a volatile Load.  The CSE was catching the loads before they were transformed
in the new is_helper_invocation intrinsic (that is not reorderable).

Fixes: 729df14e45 ("nir: Handle volatile semantics for loading HelperInvocation builtin")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19432>
2022-11-09 06:02:18 +00:00
Illia Abernikhin
aa4ac5ff8b utils: Merge util/debug.* into util/u_debug.* and remove util/debug.*
Rename env_var_as_unsigned() -> debug_get_num_option(), because duplicate
Rename env_var_as_bool() -> debug_get_bool_option(), because duplicate

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7177

Signed-off-by: Illia Abernikhin <illia.abernikhin@globallogic.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19336>
2022-11-02 07:25:39 +00:00
Kenneth Graunke
fde99747e9 nir: Drop infer_non_readable option for nir_opt_access()
Everybody sets it to true now, and the only reason for the option to
exist was to work around a bug that's now been fixed.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19162>
2022-11-02 03:42:04 +00:00
Rob Clark
5d3895d13b nir: Add way to create passthrough TCS without VS nir
In the case of disk-cache hits, radeonsi no longer has the nir shader
around.  So add a way to create a passthrough TCS with just the VS
output locations.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7567
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19382>
2022-10-29 17:46:23 +00:00
Alyssa Rosenzweig
941c37c085 nir/lower_idiv: Remove imprecise_32bit_lowering
NIR has two implementations of lower_idiv, keyed on the
imprecise_32bit_lowering flag. This flag is misleading: the results when
setting this flag "imprecise", they're completely wrong for some values.
If a backend has a native implementation of umul_high, the correct path
isn't that much more expensive. If it doesn't, it's substantially slower
for highp integer divison... but in practice, non-constant highp integer
division is pretty rare.

After a painful migration of the tree, this code path has no more users.
Remove it so nobody else gets the bright idea of using it again.

Closes: #6555
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19303>
2022-10-27 19:37:14 +00:00
Lionel Landwerlin
3c242e551d nir/lower_shader_calls: move scratch loads closer to where they're needed
The intel backend compiler is not dealing with the scratch loads
emitted by this pass very well. There are 2 reasons for this :

  - all loads are at the top of the shader

  - the loads are global load intrinsics (cannot be differentiated
    from ssbo loads for example)

This leads the backend to generate ridiculous amount of spills.

To help a bit (actually quite a lot), we can move the scratch loads in
the blocks where they're needed, using the dominance information.
Quite often that also ends up moving loads in a block that might not
be reached by all the lanes, so we're potentially avoiding some loads.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16556>
2022-10-26 12:53:25 +00:00
Lionel Landwerlin
1d10d17817 nir/lower_shader_calls: add an option structure for future optimizations
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16556>
2022-10-26 12:53:25 +00:00
Rob Clark
a8e84f50bc nir: Add helper to create passthrough TCS shader
Based on si_create_passthrough_tcs() as that seemed the most generic of
the various different backend driver implementations.  Uses the
load_tess_level_outer_default and load_tess_level_inner_default
intrinsics to load the gl_TessLevelOuter and gl_TessLevelInner values,
so driver will somehow need to implement those to load the values set
by pipe_context::set_tess_state() or similar.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19259>
2022-10-24 21:39:38 +00:00
Georg Lehmann
125741dbae nir/opt_algebraic: Optimize various find_msb_rev patterns.
From dxvk, dxil-spirv, fxc, dxc and others.

Totals from 177 (0.13% of 134913) affected shaders:
CodeSize: 1079504 -> 1059872 (-1.82%)
Instrs: 195381 -> 192269 (-1.59%)
Latency: 3664137 -> 3631951 (-0.88%)
InvThroughput: 599479 -> 585675 (-2.30%)

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18951>
2022-10-22 11:57:33 +02:00
Georg Lehmann
7505be3497 nir/opt_algebraic: Add an option to lower uclz.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18951>
2022-10-22 11:57:10 +02:00
Yonggang Luo
a9da108c6b nir: No need redefine snprintf anymore in nir.h
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18685>
2022-10-18 03:16:00 +00:00
Timur Kristóf
c0d0a7c176 nir: Add selection control enum for always taken divergent branches.
The new enum is called nir_selection_control_divergent_always_taken,
and it's almost the same as nir_selection_control_flatten.
The main difference between the two is that "flatten" represents
a choice made by the application but "divergent_always_taken" may
be applied by the compiler stack when it thinks this is beneficial.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-By: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17921>
2022-10-11 15:42:54 +00:00
Timur Kristóf
a2ec843727 nir: Document the flatten/dont_flatten selection control options.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-By: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17921>
2022-10-11 15:42:53 +00:00
SoroushIMG
1e8e785a07 nir: allow to fine tune unrolling for loops with soft fp64 ops
Lowered fp64 ops can blow up the loop bodies while still being
suitable for unrolling.
Allow for using different parameters to unroll loops with
soft fp64.

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18863>
2022-09-30 17:07:37 +00:00
SoroushIMG
121f30005f nir: track whether a loop contains soft fp64 ops
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18863>
2022-09-30 17:07:37 +00:00