Commit graph

4145 commits

Author SHA1 Message Date
Giancarlo Devich
f9a827d61e nir: Check sampler_binding is valid when lowering tex shadow
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21247>
2023-02-13 22:57:03 +00:00
Bas Nieuwenhuizen
0a17c3afc5 nir: Apply a maximum stack depth to avoid stack overflows.
A stackless (or at least using allocated memory for stack) version
might be nice but for now this works around some games compiling
large shaders and hitting stack overflows.

CC: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21231>
2023-02-11 15:01:42 +01:00
Jesse Natalie
25ee07373c nir_lower_fp16_casts: Allow opting out of lowering certain rounding modes
Reviewed-by: Giancarlo Devich <gdevich@microsoft.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21029>
2023-02-11 06:12:23 +00:00
Jesse Natalie
c0c2b60f1d nir: Add alignment to load_push_constant
Reviewed-by: Giancarlo Devich <gdevich@microsoft.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21029>
2023-02-11 06:12:23 +00:00
Faith Ekstrand
af9212dd82 nir/deref: Preserve alignments in opt_remove_cast_cast()
This also removes the loop so opt_remove_cast_cast() will only optimize
cast(cast(x)) and not cast(cast(cast(x))).  However, since nir_opt_deref
walks instructions top-down, there will almost never be a tripple cast
because the parent cast will have opt_remove_cast_cast() run on it.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21252>
2023-02-10 23:08:19 +00:00
Pavel Ondračka
94eff7ccd8 nir: shrink phi nodes in nir_opt_shrink_vectors
While this change helps with few shaders, the main benefit is
that it allows to unroll loops comming from nine+ttn on vec4
backends. D3D9 REP ... ENDREP type loops are unrolled now already,
LOOP ... ENDLOOP need some nine changes that will come later.

r300 RV530 shader-db:
total instructions in shared programs: 132481 -> 132344 (-0.10%)
instructions in affected programs: 3532 -> 3395 (-3.88%)
helped: 13
HURT: 0

total temps in shared programs: 16961 -> 16957 (-0.02%)
temps in affected programs: 88 -> 84 (-4.55%)
helped: 4
HURT: 0

Reviewed-by: Emma Anholt <emma@anholt.net>
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Partial fix for: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8102
Partial fix for: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7222

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21038>
2023-02-10 09:06:25 +00:00
Ian Romanick
18fc4daaf6 nir/inline_uniforms: Add inot condition support
From the 96c19d23c9 commit message:

    Ever since 4246c2869c and 7d85dc4f35 loop unrolling can no
    longer depend on inot being eliminated from the loop
    terminator condition so we need to be able to handle it.

Support these conditions here too.

Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21179>
2023-02-10 03:18:23 +00:00
Ian Romanick
682e83f012 nir/inline_uniforms: Make add_inlinable_uniforms public
This is step 5 in an attempt to unify a bunch of nir_inline_uniforms.c
and lvp_inline_uniforms.c code.

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21179>
2023-02-10 03:18:23 +00:00
Ian Romanick
cdd23b1efa nir/inline_uniforms: Make src_only_uses_uniforms public, change name
While making the function public, rename it to
nir_collect_src_uniforms. The old name makes it sound like it's just a
query that doesn't have side effects. That is, however, not the case.

This is step 4 in an attempt to unify a bunch of nir_inline_uniforms.c
and lvp_inline_uniforms.c code.

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21179>
2023-02-10 03:18:23 +00:00
Ian Romanick
edb89b71c5 nir/inline_uniforms: Allow possibility of uni_offsets and num_offsets being NULL
This is step 3 in an attempt to unify a bunch of nir_inline_uniforms.c
and lvp_inline_uniforms.c code.

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21179>
2023-02-10 03:18:23 +00:00
Ian Romanick
0c0fb216dd nir/inline_uniforms: Allow possibility of more than one UBO
Only caller in this file still only passes 1.

This is step 2 in an attempt to unify a bunch of nir_inline_uniforms.c
and lvp_inline_uniforms.c code.

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21179>
2023-02-10 03:18:23 +00:00
Ian Romanick
23b4266f9e nir/inline_uniforms: Pass max_num_bo and max_offset around as parameters
max_num_bo is currently limited to 1. That will change in the next
commit.

This is step 1 in an attempt to unify a bunch of nir_inline_uniforms.c
and lvp_inline_uniforms.c code.

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21179>
2023-02-10 03:18:23 +00:00
Ian Romanick
1d5033823e nir/inline_uniforms: Change num_offsets type to uint8_t
This is step 0 in an attempt to unify a bunch of nir_inline_uniforms.c
and lvp_inline_uniforms.c code.

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21179>
2023-02-10 03:18:23 +00:00
Alejandro Piñeiro
3685528c1e nir: track if var copies lowering was called
In general we should only call it once, and then we should avoid to
call any lowering that introduce back copies. So far we were tracking
that manually out of the nir shader on several places.

Ideally we would like to add a nir_validate rule, but right now there
are some exceptions to this rule. For example right now the Intel
compiler calls nir_lower_io_to_temporaries as part of linking
tess_ctrl/mesh/task sahders.

One option would be to allow drivers to reset the value, but for now
let's not add that validation rule.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19338>
2023-02-06 22:11:34 +00:00
Konstantin Seurer
9104dafb6f vulkan,nir: Refactor ycbcr conversion state into a struct
This will be useful for RADV since it hashes the state.

v3dv changes:
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20731>
2023-02-06 18:36:29 +00:00
Jason Ekstrand
9c62e0c77d nir: Remove nir_lower_io_force_sample_interpolation
It's no longer used.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>
2023-02-06 09:12:17 +00:00
Alyssa Rosenzweig
6b97f396e6 nir/lower_clip: Only emit 1 discard
If we have multiple clip planes, rather than emit multiple discards we can just
OR together the discard criteria. Then a nir_opt_algebraic rule kicks in to
optimize out the flt/.../flt/ior/.../ior into fmin/.../fmin/flt, generating
much less code at the end.

Written while debugging an unrelated issue with the clip lowering.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21103>
2023-02-06 02:50:20 +00:00
Alyssa Rosenzweig
93db6094a1 nir/print: Pretty-print color0/1_interp
These are an enum. Furthermore, their 0 state is INTERP_MODE_NONE which we
shouldn't bother printing at all.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21091>
2023-02-04 17:26:30 +00:00
Alyssa Rosenzweig
b235be1fd4 nir/print: Pretty-print I/O semantic locations
Instead of printing the raw location number, which is pretty hard to interpret,
let's print the name of the location. Example output:

   vec4 16 ssa_2 = intrinsic load_interpolated_input (ssa_0, ssa_1) (base=0,
   component=0, dest_type=float16 /*144*/, io location=VARYING_SLOT_VAR0 slots=1
   mediump /*8388768*/)

One of the "regressions" from moving to purely lowered I/O with all variables
removed is a lack of debuggability, since otherwise these location strings don't
show up anywhere in the printed shader! By contrast this should make the lowered
I/O nice to read like the early I/O.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21091>
2023-02-04 17:26:30 +00:00
Alyssa Rosenzweig
435e7f5e6d nir/print: Extract get_location_str
Locations show up in two places: variables and lowered I/O semantics. We want to
reuse the logic in both places, so extract it out. The extracted logic is IMO
easier to read, too.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21091>
2023-02-04 17:26:30 +00:00
Hampus Linander
4ffc7c3ff4 nir: Add extr_agx opcode
The AGX extr instruction extracts a bitfield from two 32bit registers.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20628>
2023-02-04 11:13:24 -05:00
Ian Romanick
ea413e826b nir: Eliminate nir_op_f2b
Builds on the work of !15121.  This gets to delete even more code
because many drivers shared a lot of code for i2b and f2b.

No shader-db or fossil-db changes on any Intel platform.

v2: Rebase on 1a35acd8d9.

v3: Update a comment in nir_opcodes_c.py. Suggested by Konstantin.

v4: Another rebase. Remove f2b stuff from Midgard.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20509>
2023-02-03 22:39:57 +00:00
Ian Romanick
024122c069 nir/builder: Handle f2b conversions specially in nir_type_convert
No shader-db or fossil-db changes on any Intel platform.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20509>
2023-02-03 22:39:57 +00:00
Ian Romanick
b265020b82 nir/builder: Eliminate nir_f2b helper (and use of nir_f2b32 helper)
There were only two users. Replace each with nir_fneu instead.

This is now a squash of what was two separate commits.
nir_lower_pstipple_block is called after nir_lower_bool_to_int32, so
nir_fneu32 has to be used here or there will be regresssions in stipple
tests on llvmpipe.

v2: Rebase on !20869.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Suggested-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20509>
2023-02-03 22:39:57 +00:00
Alyssa Rosenzweig
071ac59960 nir: Add a late texcoord replacement pass
Add a second NIR pass for lowering point/texture coordinate replacement (i.e.
point sprites). Why a second one? The current pass works on derefs/variables,
which is good for drivers that don't lower I/O at all (like Zink, where the pass
originates). However, it is problematic for hardware drivers: the inputs to this
pass depend on the shader key, so we want to run the pass as late as possible to
minimize the cost of building/compiling the associated shader variants. In
particular, we need to be able to lower point sprites after lowering I/O if we
would like to lower I/O when preprocessing NIR.

The logic for early lowering and late lowering is considerably different (the
late lowering is a lot simpler), so I've split this out into a second pass
rather than trying to weld them together into one.

This pass will be used on Asahi, which currently uses the early pass. It may be
useful for other drivers as well. (Actually, it's been shipping on Asahi for a
little while now, just hasn't been sent upstream yet.)

Tested with Neverball.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Asahi Lina <lina@asahilina.net>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21065>
2023-02-03 15:03:06 +00:00
Qiang Yu
f6b194b648 nir,ac/llvm,aco,radv,radeonsi: remove nir_export_vertex_amd
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20691>
2023-02-03 12:27:44 +00:00
Qiang Yu
f44872c7b6 nir,ac/llvm,aco: remove nir_export_primitive_amd
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20691>
2023-02-03 12:27:44 +00:00
Qiang Yu
5f24d58549 nir: add nir_export_amd intrinsic
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20691>
2023-02-03 12:27:43 +00:00
Sagar Ghuge
0ec3522163 nir: Handle other variants of image_samples properly while lowering
while lowering image_samples to one, we need to take
nir_intrinsic_image_deref_samples and
nir_intrinsic_bindless_image_samples intrinsic into account.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8211

Fixes: ab4c2990ed ("intel/compiler: use lower_image_samples_to_one")
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21053>
2023-02-02 21:40:45 +00:00
Timur Kristóf
1244506c15 nir/opt_algebraic: Add optimization for ieq/ine and right-shift.
Fossil DB stats on GFX11:
Totals from 1343 (1.00% of 134913) affected shaders:
SpillSGPRs: 7145 -> 7137 (-0.11%)
CodeSize: 20737744 -> 20739148 (+0.01%); split: -0.02%, +0.03%
Instrs: 4010443 -> 4008449 (-0.05%); split: -0.05%, +0.00%
Latency: 50021520 -> 50021105 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 6354371 -> 6354112 (-0.00%); split: -0.00%, +0.00%
VClause: 63035 -> 63038 (+0.00%); split: -0.01%, +0.01%
SClause: 121162 -> 121166 (+0.00%)
Copies: 251354 -> 251058 (-0.12%); split: -0.18%, +0.06%
PreSGPRs: 137283 -> 137299 (+0.01%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20936>
2023-02-02 03:08:19 +00:00
Pavel Ondračka
7e6acfd587 nir: mark progress when removing trailing unused load_const channels
When the unused channels were at the end and so no reswizzling was
needed, we wouldn't correctly mark the progress.

Fixes: 3305c960
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21014>
2023-02-01 20:33:31 +00:00
Pavel Ondračka
fe56dd9c42 nir: mark progress when removing trailing unused alu channels
When the unused channels were at the end and so no reswizzling was
needed, we wouldn't correctly mark the progress.

Fixes: cb7f2012
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21014>
2023-02-01 20:33:31 +00:00
Pavel Ondračka
ef800da3f7 nir: nir opt_shrink_vectors whitespace fix
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21014>
2023-02-01 20:33:31 +00:00
Amber
c384690ab7 nir: support lowering nir_intrinsic_image_samples to a constant load
This can be used by multiple drivers that do not support ms images

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Reviewer-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Signed-off-by: Amber Amber <amber@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20813>
2023-02-01 19:52:49 +00:00
Alyssa Rosenzweig
b0b5a71c74 nir/opt_preamble: Consider load_preamble as movable
It's kosher to get load_preamble intrinsics ahead of time if the driver is
pushing sysvals. Handle them like load_uniform.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by-(with-sparkles): Asahi Lina <lina@asahilina.net>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20562>
2023-01-31 17:02:34 +00:00
Alyssa Rosenzweig
05d3238692 nir/opt_preamble: Treat *size as an input
Some backends may wish to reserve early uniforms for internal system values, and
use the remaining space for preamble storage. In this case, it's convenient to
teach nir_opt_preamble about a reserved offset. It's logical to treat the output
*size instead of an in/out variable that nir_opt_preamble adds to. This requires
a slight change to the consumers to zero the input.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by-(with-sparkles): Asahi Lina <lina@asahilina.net>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20562>
2023-01-31 17:02:34 +00:00
Marcin Ślusarz
2255375c4d nir: add nir_mod_analysis & its tests
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20050>
2023-01-31 13:50:08 +00:00
Erik Faye-Lund
f00c9e85e5 meson: use files() instead of joining paths
The Meson docs points out that it's better to use the files() function
when referring to files in the source tree than manually constructing
paths like this. Let's follow that advice, and get some neat cleanups.

Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20907>
2023-01-27 11:35:50 +00:00
Timur Kristóf
65a917cb6e nir: Add algebraic optimization for VKD3D-Proton fp32->fp16 conversion.
VKD3D-Proton DXBC f32 to f16 conversion implements a float conversion using PackHalf2x16.
Because the spec does not specify a rounding mode, it emits a sequence to ensure
D3D-like behaviour for infinity.

When we know the current backend has pack_half_2x16_rtz_split,
we can eliminate the extra sequence.

Fossil DB stats on GFX11:
Totals from 835 (0.62% of 134913) affected shaders:
VGPRs: 49368 -> 49224 (-0.29%)
CodeSize: 5341956 -> 5124564 (-4.07%)
Instrs: 1024062 -> 987041 (-3.62%)
Latency: 6530956 -> 6465120 (-1.01%); split: -1.01%, +0.00%
InvThroughput: 908189 -> 870253 (-4.18%)
VClause: 18704 -> 18702 (-0.01%); split: -0.02%, +0.01%
SClause: 33406 -> 33284 (-0.37%); split: -0.38%, +0.01%
Copies: 67440 -> 65992 (-2.15%); split: -2.15%, +0.00%
Branches: 18498 -> 18465 (-0.18%)
PreSGPRs: 38409 -> 38331 (-0.20%)
PreVGPRs: 44089 -> 43834 (-0.58%)

Note, some fossils are from before this pattern was added to VKD3D-Proton,
so the above may not reflect real-world impact.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15838>
2023-01-26 12:24:24 +00:00
Timur Kristóf
7985933a6d nir: Lower pack_half_2x16_split to RTZ if available.
Constant folding always uses RTNE for pack_half_2x16_split, but some
backends implement it with RTZ.

Lowering to RTZ when available ensures that the behaviour will be
consistent between constant folding and the backend.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15838>
2023-01-26 12:24:24 +00:00
Timur Kristóf
12652cc549 nir: Add pack_half_2x16_rtz_split opcode.
Same as pack_half_2x16_rtz_split, but always uses RTZ mode.
Note that pack_half_2x16 rounding mode is unspecified.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15838>
2023-01-26 12:24:24 +00:00
Lionel Landwerlin
ff34e96701 nir/lower_io: fix bounds checking for 64bit_bounded_global
If the offset is negative like it's the case in

dEQP-VK.robustness.robustness2.bind.notemplate.r32i.unroll.volatile.storage_buffer_dynamic.readwrite.no_fmt_qual.len_256.samples_1.1d.comp

we end up passing the bounds checking condition because it's using
signed integers.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20762>
2023-01-19 09:16:40 +00:00
Alyssa Rosenzweig
e664082d35 nir/lower_blend: No-op nir_color_mask if no mask
In this usual case, do a quick check to avoid generating 5 useless instructions
(mov/vec4 instructions). They'll get copypropped but that creates more work for
the optimizer and nir/lower_blend runs in a hot variant path on both Asahi and
Panfrost.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20016>
2023-01-19 04:09:17 +00:00
Alyssa Rosenzweig
1fc25c8c79 nir/lower_blend: Handle undefs in stores
nir/lower_blend asserts:

   assert(nir_intrinsic_write_mask(store) ==
          nir_component_mask(store->num_components));

For the special blend shaders used in Panfrost, this holds. But for arbitrary
shaders coming out of GLSL-to-NIR (as used with Asahi), this does not hold. In
particular, after nir_opt_undef runs, undefined components can be trimmed.
Concretely, if we have the shader:

    gl_FragColor.xyz = foo;

Then this becomes in NIR

   gl_FragColor = vec4(foo.xyz, undef);

and then opt_undef will give the store_deref a wrmask of xyz but 4 components.
Then lower_blend asserts out.

Found in a gfxbench shader on asahi.

Closes: #6982
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20016>
2023-01-19 04:09:17 +00:00
Alyssa Rosenzweig
8b83210ab3 nir/lower_blend: Don't do logic ops on pure float
Per the spec.

Fixes arb_color_buffer_float-render on both Panfrost and Asahi (before/after
reproduced on Mali-T860 and AGX G13 respectively). Without that patch, that test
fails the assertion:

arb_color_buffer_float-render: ../src/compiler/nir/nir_lower_blend.c:259: nir_blend_logicop: Assertion `util_format_is_pure_integer(format)' failed.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20016>
2023-01-19 04:09:17 +00:00
Alyssa Rosenzweig
dbd0615e7a nir/lower_blend: Avoid useless iand with logic ops
The upper bits start correctly, there's no need to clear them as long as we keep
them zero'ed by using ixor with a valid bit mask instead of inot.

Makes the code generated for logic op slightly less ridiculous. I'm joking. It's
still ridiculous but I'm not in the mood to fix up the Midgard compiler and it's
just a little ALU for a feature almost nothing uses.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20016>
2023-01-19 04:09:17 +00:00
Alyssa Rosenzweig
ee127f03e4 nir/lower_blend: Fix SNORM logic ops
We need to sign extend. Incidentally this means the iand above is useless for
SNORM.

Fixes arb_color_buffer_float-render with GL_RGBA8_SNORM.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20016>
2023-01-19 04:09:17 +00:00
Alyssa Rosenzweig
f9839e7e1b nir/lower_blend: Clamp blend factors
Particularly constant colours, but also (more obscurely) SNORM.

Fixes arb_color_buffer_float-render with SNORM framebuffers. Issue affects both
Asahi and Panfrost (the latter after we start advertising EXT_render_snorm).

v2: Check the blend factor to avoid unnecessary clamps. This avoids regressing
blend shader code quality on Panfrost.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com> [v1]
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20016>
2023-01-19 04:09:17 +00:00
Alyssa Rosenzweig
fca457790e nir/lower_blend: Fix alpha=1 for RGBX format
In this case we have 4 components but the value of the fourth component
is undefined. Apply the fixup we already have.

Fixes
dEQP-GLES3.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.0
on Asahi. That test blend with DST_ALPHA with its RGB565 attachment,
which is fine if RGB565 is preserved, but Asahi is demoting that
format to RGBX8 which means -- after lowering the tilebuffer access --
we blend with an ssa_undef.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20016>
2023-01-19 04:09:17 +00:00
Lionel Landwerlin
b82d9b1a3d nir/divergence: add missing RT intrinsinc handling
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20763>
2023-01-18 22:32:43 +00:00