Commit graph

7577 commits

Author SHA1 Message Date
Marek Olšák
e49c1719e0 nir: fix 2 bugs in nir_create_passthrough_tcs
- VAR31 was ignored.
- Only a half of the 16-bit slot was passed through, though I'm not sure
  if nir_lower_io handles vec8. The slots are only for GLES and I don't
  think a passthrough TCS is possible with GLES.

Fixes: a8e84f50bc - nir: Add helper to create passthrough TCS shader

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21861>
(cherry picked from commit ace8a7068e)
2023-04-26 09:43:12 -07:00
Mike Blumenkrantz
c67e28251e nir/lower_alpha_test: rzalloc state slots
this otherwise leads to uninitialized memory

cc: mesa-stable

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22558>
(cherry picked from commit 24555f5462)
2023-04-26 09:43:09 -07:00
Eric Engestrom
47b2402bdc compiler: fix buggy usage of unreachable()
Cc: mesa-stable
Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22529>
(cherry picked from commit f5ed1c79ae)
2023-04-25 09:52:21 -07:00
antonino
b96362381b nir: avoid generating conflicting output variables
Because not all vertex outputs can have corresponding fragment inputs
(eg. edgeflags) some logic is needed to correctly generate variables in
a passthough gs.

Before this change some output variables ened up with the same location.

Fixes: d0342e28b3 ("nir: Add helper to create passthrough GS shader")

Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21238>
(cherry picked from commit edecb66b01)
2023-04-04 15:56:12 -07:00
antonino
a0224f1b3a nir: handle primitives with adjacency
`nir_create_passthrough_gs` can now handle primitives with adjacency where some
vertices need to be skipped.

Fixes: d0342e28b3 ("nir: Add helper to create passthrough GS shader")
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21238>
(cherry picked from commit ea14579f3d)
2023-04-04 15:56:12 -07:00
Konstantin Seurer
95d24c496c nir/lower_shader_calls: Remat derefs before lowering resumes
Closes: #7923
cc: mesa-stable

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20399>
(cherry picked from commit 200e551cbb)
2023-04-04 13:06:00 -07:00
Jesse Natalie
682b695e4a nir: Propagate alignment when rematerializing cast derefs
Fixes: 878a8daca6 ("nir: Add alignment information to cast derefs")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21975>
(cherry picked from commit 49885f87c3)
2023-03-29 14:56:06 -07:00
Sviatoslav Peleshko
5fe3b05f98 glsl: Fix codegen for constant ir_binop_{l,r}shift with mixed types
Fixes: 13106e10 ("glsl: Generate code for constant ir_binop_lshift and ir_binop_rshift expressions")

Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17787>
(cherry picked from commit 1648e3b4b9)
2023-03-28 13:46:18 -07:00
Timothy Arceri
530c9bdce5 glsl: allow 64-bit integer on RHS of shift
Fixes: 9ba9a7f854 ("glsl: Add 64-bit integer support to some operations.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6862

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21919>
(cherry picked from commit 174d6e6a54)
2023-03-28 13:46:13 -07:00
Isabella Basso
f2ea143171 nir/algebraic: remove duplicate bool conversion lowerings
While [1] added some boolean conversion lowering patterns, those were
already dealt with on [2].

[1] - b86305bb ("nir/algebraic: collapse conversion opcodes (many patterns)")
[2] - d7e0d47b ("nir/algebraic: nir: Add a bunch of b2[if] optimizations")

Fixes: b86305bb ("nir/algebraic: collapse conversion opcodes (many patterns)")
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Signed-off-by: Isabella Basso <isabellabdoamaral@usp.br>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20965>
(cherry picked from commit 59fea8af3a)
2023-03-27 14:06:51 -07:00
Isabella Basso
96a918299e nir/algebraic: insert patterns inside optimizations list
Some patterns were outside the list of optimizations.

Fixes: b86305bb ("nir/algebraic: collapse conversion opcodes (many patterns)")

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Signed-off-by: Isabella Basso <isabellabdoamaral@usp.br>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20965>
(cherry picked from commit b3685f3ba7)
2023-03-27 09:49:51 -07:00
Lionel Landwerlin
a0b65ff670 nir: fix nir_ishl_imm
Both GLSL & SPIRV have undefined values for shift > bitsize. But SM5
says :

   "This instruction performs a component-wise shift of each 32-bit
    value in src0 left by an unsigned integer bit count provided by
    the LSB 5 bits (0-31 range) in src1, inserting 0."

Better to not hard code the wrong behavior in NIR.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e227bb9fd5 ("nir/builder: add ishl_imm helper")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@colllabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21720>
(cherry picked from commit a278eeb719)
2023-03-20 18:00:20 -07:00
Marek Olšák
45fab91421 nir: lower to fragment_mask_fetch/load_amd with EQAA correctly
Fixes: 194add2c23 ("nir: lower image add lower_to_fragment_mask_load_amd option")
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21436>
(cherry picked from commit 0c8e7ad47e)
2023-02-27 13:50:09 -08:00
Karol Herbst
8b9b246a2f nir/deref: don't replace casts with deref_struct if we'd lose the stride
The result might be used in a deref_ptr_as_array, which requires a proper
stride within lower_explicit_io. If we'd lose that information or end up
with a different stride don't execute this optimization.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8289
Fixes: b779baa9bf ("nir/deref: fix struct wrapper casts. (v3)")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21458>
(cherry picked from commit 56a9aad401)
2023-02-22 19:12:06 -08:00
Faith Ekstrand
785cf85296 nir/deref: Preserve alignments in opt_remove_cast_cast()
This also removes the loop so opt_remove_cast_cast() will only optimize
cast(cast(x)) and not cast(cast(cast(x))).  However, since nir_opt_deref
walks instructions top-down, there will almost never be a tripple cast
because the parent cast will have opt_remove_cast_cast() run on it.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
(cherry picked from commit af9212dd82)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21467>
2023-02-22 23:33:44 +00:00
Jesse Natalie
d55de21f0f clc: Include opencl-c-base.h with LLVM 15 (using builtins)
Reviewed-by: Karol Herbst <kherbst@redhat.com>
(cherry picked from commit b27d8ee2e9)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21467>
2023-02-22 23:33:44 +00:00
Timothy Arceri
09c6627fa3 glsl: isolate object macro replacments
Here we use a leading space to isolate them from
the code they will be inserted into. For example:

    #define VALUE -1.0
    int a = -VALUE;

Should be evaluated to int a = - -1.0; not int a = --1.0;

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7932

Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21352>
(cherry picked from commit 3a9edfc494)
2023-02-22 10:12:08 -08:00
Timothy Arceri
34170edbb6 glsl: add _token_list_prepend() helper to the parser
This will be used in the following patch.

Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21352>
(cherry picked from commit 6e29dce291)
2023-02-22 10:12:07 -08:00
Faith Ekstrand
07b9046128 nir/from_ssa: Move the loop bounds check in resolve_parallel_copy
We loop, effectively, over two stacks: ready and to_do and finish only
when both are empty.  In the case where ready is empty, we pull one off
of to_do, add a copy to a temporary, and push it onto the ready stack.
Previously, we assumed that we would never get to the temporary copy
case if to_do has exactly one entry because that would imply that there
was only one copy left which means there can't possibly be a cycle to
break.  This was true until c7fc44f9eb ("nir/from_ssa: Respect and
populate divergence information") which changed things such that
temporary copies sometimes get added in the case where a convergent
value is copied both to convergent and divergent destinations.

This patch adjusts our loop iteration to always attempt to clear the
ready stack before checking if there's anything left on the to_do stack.
I also added an assert to make the exit condition more clear.

Fixes: c7fc44f9eb ("nir/from_ssa: Respect and populate divergence information")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8037
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21315>
(cherry picked from commit 4e09d37f3b)
2023-02-16 15:17:17 -08:00
Faith Ekstrand
8eb3e6d1a3 nir/from_ssa: Only re-locate values that are destinations
There is an optimization in the parallel copy algorithm where, after a
copy has been performed, we can treat the destination as the new source
for future copies of the same source.  In particular, consider the
following parallel copy: A -> B, C -> A, A -> C.  In this case, after we
have done the A -> B copy, we can make note that the value in A is now
in B and emit the sequence: A -> B, C -> A, B -> C.  This allows us to
resolve the swap cycle between A anc C without allocating a temporary
register because we know B is also a copy of A.

When one of the registers involved is convergent and the other is
divergent, this optimization is problematic because, while convergent to
divergent copies are fine, we can't re-use the divergent copy in later
copies if any of those copies are to a convergent variable.  We could,
but it would require a read_first_invocation which would get messy.  In
In c7fc44f9eb ("nir/from_ssa: Respect and populate divergence
information"), we attempted to deal with this by limiting the rename
optimization to the case where the divergence matched.

The problem is that we did the re-name part whenever the divergence
matched but only marked it as ready if the thing being copied was a
destination.  (We actually left two instances of loc[a] = b, one which
always happened and one which only happened if we also wanted to flag
the source as being ready to use as a destination.)  While this
technically doesn't cause any problems, it may result in more inter-mov
dependencies which hurts instruction scheduling.  For example, if we had
the parallel copy A -> B, A -> C, A -> D, we now end up emitting the
sequence A -> B, B -> C, C -> D which has many more data hazards between
instructions caused by the constant shuffling.

This commit restores the original logic in which we only perform the
rename optimization if the rename would free up a register we will later
use as a destination.  This isn't entirely optimal as it still doesn't
prove that there is a cycle involved first, but it should lead to a
reduction in unnecessary dependencies.

No shader-db changes on SKL or DG2

Fixes: c7fc44f9eb ("nir/from_ssa: Respect and populate divergence information")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21315>
(cherry picked from commit 5afba073c6)
2023-02-16 15:17:17 -08:00
Michel Dänzer
67b957b0a4 glsl/standalone: Do not pass memory allocated with ralloc_size to free
Pointed out by GCC:

In function ‘load_text_file’,
    inlined from ‘standalone_compile_shader’ at ../src/compiler/glsl/standalone.cpp:491:38,
    inlined from ‘main’ at ../src/compiler/glsl/main.cpp:98:45:
../src/compiler/glsl/standalone.cpp:358:17: error: ‘free’ called on pointer ‘block_195’ with nonzero offset 48 [-Werror=free-nonheap-object]
  358 |             free(text);
      |                 ^
In function ‘ralloc_size’,
    inlined from ‘load_text_file’ at ../src/compiler/glsl/standalone.cpp:352:31,
    inlined from ‘standalone_compile_shader’ at ../src/compiler/glsl/standalone.cpp:491:38,
    inlined from ‘main’ at ../src/compiler/glsl/main.cpp:98:45:
../src/util/ralloc.c:117:18: note: returned from ‘malloc’
  117 |    void *block = malloc(align64(size + sizeof(ralloc_header),
      |                  ^

Fixes: a9696e79fb ("main: Close memory leak of shader string from load_text_file.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21215>
(cherry picked from commit 49a6bdde8e)
2023-02-16 09:10:45 -08:00
Michel Dänzer
f34f0a0c20 glsl/standalone: Fix up _mesa_reference_shader_program_data signature
Drop the unused ctx parameter, to match the main Mesa code.

Fixes ODR violation flagged by -Wodr with LTO enabled:

../src/mesa/main/shaderobj.h:74:1: error: ‘_mesa_reference_shader_program_data’ violates the C++ One Definition Rule [-Werror=odr]
   74 | _mesa_reference_shader_program_data(struct gl_shader_program_data **ptr,
      | ^
../src/compiler/glsl/standalone_scaffolding.cpp:76:1: note: type mismatch in parameter 1
   76 | _mesa_reference_shader_program_data(struct gl_context *ctx,
      | ^
../src/compiler/glsl/standalone_scaffolding.cpp:76:1: note: ‘_mesa_reference_shader_program_data’ was previously declared here
../src/compiler/glsl/standalone_scaffolding.cpp:76:1: note: code may be misoptimized unless ‘-fno-strict-aliasing’ is used

Fixes: 717a720e9c ("mesa: drop unused context parameter to shader program data reference.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21215>
(cherry picked from commit bf67f32d4b)
2023-02-16 09:10:45 -08:00
Bas Nieuwenhuizen
0e24e133d9 nir: Apply a maximum stack depth to avoid stack overflows.
A stackless (or at least using allocated memory for stack) version
might be nice but for now this works around some games compiling
large shaders and hitting stack overflows.

CC: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21231>
(cherry picked from commit 0a17c3afc5)
2023-02-16 09:10:45 -08:00
Pavel Ondračka
8bb100fc02 nir: mark progress when removing trailing unused load_const channels
When the unused channels were at the end and so no reswizzling was
needed, we wouldn't correctly mark the progress.

Fixes: 3305c960
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21014>
(cherry picked from commit 7e6acfd587)
2023-02-01 13:48:45 -08:00
Pavel Ondračka
23336938ca nir: mark progress when removing trailing unused alu channels
When the unused channels were at the end and so no reswizzling was
needed, we wouldn't correctly mark the progress.

Fixes: cb7f2012
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21014>
(cherry picked from commit fe56dd9c42)
2023-02-01 13:48:44 -08:00
Lionel Landwerlin
c90f932257 nir/lower_io: fix bounds checking for 64bit_bounded_global
If the offset is negative like it's the case in

dEQP-VK.robustness.robustness2.bind.notemplate.r32i.unroll.volatile.storage_buffer_dynamic.readwrite.no_fmt_qual.len_256.samples_1.1d.comp

we end up passing the bounds checking condition because it's using
signed integers.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20762>
(cherry picked from commit ff34e96701)
2023-01-24 15:27:04 -08:00
Lionel Landwerlin
5c3cd5da22 nir/divergence: add missing RT intrinsinc handling
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20763>
(cherry picked from commit b82d9b1a3d)
2023-01-24 15:27:03 -08:00
t0b3
54cfb552ab nir/nir_opt_move: fix ALWAYS_INLINE compiler error
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Closes: #6825
Fixes: f1d20ec6 ("nir/nir_opt_move: handle non-SSA defs ")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17439>
(cherry picked from commit 267dd1f4d5)
2023-01-13 13:52:42 -08:00
Alyssa Rosenzweig
2976548e4a nir/gather_info: Handle store_zs_agx
This acts as a depth/stencil write. The AGX compiler checks outputs_written to
determine what conservative depth settings the driver needs. Nominally, this
should work: the original store_output(FRAG_RESULT_DEPTH) intrinsic causes the
DEPTH outputs_written bit to be set, so the metadata is still correct after
lowering store_output to store_zs_agx. However, there are a handful of places
that call nir_gather_info late, which *resets* the existing outputs_written
value and regathers, causing Asahi to use the wrong conservative depth settings
when shuffling NIR pass order and breaking gl_FragDepth.

To fix, handle store_zs_agx conservatively when gathering info so we don't have
to play games with the pass order or stashing info in a sideband.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20563>
2023-01-11 21:14:20 +00:00
Alyssa Rosenzweig
cc5ca8164d nir: Add store_agx intrinsic
This works like store_global, but lets us optimize address arithmetic. Like
load_agx, it is formatted to match the hardware semantic. We don't make use of
any clever formats in this series, though.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20558>
2023-01-11 20:36:51 +00:00
Rob Clark
5fb0992a53 mesa/st: Track complete access qualifier for images
Don't turn gl_access_qualifier coming from NIR back into GL enums,
losing information in the process.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20612>
2023-01-11 20:09:01 +00:00
Jesse Natalie
b0f3a387c9 nir_lower_fragcoord_wtrans: Support Vulkan shaders
In Vulkan shaders, you might not have all derefs pointing to a variable

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20400>
2023-01-10 04:25:26 +00:00
Timothy Arceri
ac5af6c06d util/driconf: add Dune: Spice Wars workaround
As per the bug report the game does not correctly handle a uniform
index of -1 being returned for the unused array element, which
results in rendering issues. So here we skip the uniform array
resizing optimisation.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6397
Cc: mesa-stable

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20579>
2023-01-10 03:53:19 +00:00
Mary
d8e5714e81 isaspec: Fix bitmask conversions when isa.bitsize < 64
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20541>
2023-01-07 00:14:10 +01:00
Jason Ekstrand
39c6f6454c isaspec: Give decode.c/h more descriptive names
Because these are being included across subdir boundaries, the name
"decode" is potentially pretty overloaded.  Instead, prefix them with
"isaspec_".  Also, since they're both weird includes now and not really
complete files in their own right, give them a descriptive suffix.

Acked-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20525>
2023-01-05 18:21:02 +00:00
Jason Ekstrand
e8945a8ce6 isaspec: Stop depending on glue headers and out-of-folder C files
The way the isaspec decoder used to work was that it would generate a
header and a C file, each with ISA-specific stuff in it. Then that would
get built together with a stand-alone decode.c file which lives in the
isaspec folder, not the driver's folder.  In order for decode.c to find
the ISA-specific headers, it would also generate a glue header which had
to be named isaspec-isa.h.  This effectively meant that you can't have
multiple isaspec definitions in the same folder.

To solve this, we make do it the other way around and make the generated
header and C files include the stand-alone files.  This is a bit awkward
because it means including a C file from another C file but it's better
for the build system.

Acked-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20525>
2023-01-05 18:21:02 +00:00
Jason Ekstrand
4953a8db25 isaspec: Use argparse
This also cleans up some of our python script execution conventions and
handles mako errors better. Copied a bit from vk_entrypoints_gen.py.

Acked-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20525>
2023-01-05 18:21:02 +00:00
Jason Ekstrand
e83ad77ef5 isaspec: Stop using s and xml from the global namespace
We really shouldn't rely on these being global variables.  Pass them
along instead.

Acked-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20525>
2023-01-05 18:21:02 +00:00
Pavel Ondračka
a93bc6afc4 nir: check for x - ffract(x) patterns when lowering f2i32
We already skip emitting ftrunc in nir_lower_int_to_float when there is
ffloor, fround or any other integer-making opcode preceding f2i32. However
if lower_ffloor is set for driver that doesn't support integers, the lowered
x - ffract(x) patterns would not be recognized and extra ftruct would be
emitted, doing unnecessary rounding.

This optimization only works if there is no non-trivial swizzling used for
the fadd, fneg and ffract involved, which seems to be 99% of the cases according
to my testing.

This is needed to enable nir ffloor lowering on r300 driver without regressions.

I'm not sure if this helps anybody else, the only hardware which sets
lower_ffloor and converts ints to floats (and can't do trunc) are some old
etnaviv cards, so maybe it will help there a bit.

Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20208>
2023-01-05 12:01:32 +00:00
Qiang Yu
cf2ea3fce9 nir/xfb: save high_16bits output info
It is combined with slot location to identify a varying
when using VARYING_SLOT_VARx_16BIT.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20157>
2023-01-05 01:12:06 +00:00
Ian Romanick
043508d8f8 glsl: Remove bit_count lowering
As far as I can tell, every driver that supports GLSL 1.30 or
GL_EXT_gpu_shader4 (and therefore also enables support for
GL_MESA_shader_integer_functions) also sets the NIR lower_bit_count
flag.

Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20323>
2023-01-03 18:37:53 -08:00
Ian Romanick
abe5acf7fd glsl: Remove bitfield_reverse lowering
As far as I can tell, every driver that supports GLSL 1.30 or
GL_EXT_gpu_shader4 (and therefore also enables support for
GL_MESA_shader_integer_functions) also sets the NIR
lower_bitfield_reverse flag.

Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20323>
2023-01-03 18:37:53 -08:00
Ian Romanick
f5722c4973 glsl: Remove bitfield_extract and bitfield_insert lowering
As far as I can tell, every driver that supports GLSL 1.30 or
GL_EXT_gpu_shader4 (and therefore also enables support for
GL_MESA_shader_integer_functions) also sets some subset of the various
NIR lower_bitfield_extract and lower_bitfield_insert flags.

v2: Declaration of 'result' still needs to be added to the IR. Noticed
by marge.

v3: Fix 'git rebase --autosquash' putting the v2 fix in the wrong
place. I've never seen that happen before. :(

Reviewed-by: Emma Anholt <emma@anholt.net> [v1]
Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20323>
2023-01-03 18:37:53 -08:00
Ian Romanick
db241fbd70 nir: Don't allow conflicting bitfield lowering passes
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20323>
2023-01-03 18:37:53 -08:00
Pavel Ondračka
53d9b696e4 nir: basic tests for nir_opt_shrink_vectors
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20213>
2023-01-03 12:32:33 +01:00
Pavel Ondračka
3305c9602d nir: fix shrinking of load_const for large vectors
Specifically when shrinking load_const with number of components
> 5, if the final number of components is not allowed (for example 8->6)
it would report false for progress even if we actually did some
reshuffling and also it would skip on the rewrite of the readers.

Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20213>
2023-01-03 12:32:33 +01:00
Pavel Ondračka
cb7f201288 nir: remove duplicate alu channels in nir_opt_shrink_vectors
This will clean code like:
   vec3 32 ssa_8 = frcp ssa_7.www
   vec3 32 ssa_9 = fmul ssa_7.xyz, ssa_8
into
   vec1 32 ssa_8 = frcp ssa_7.w
   vec3 32 ssa_9 = fmul ssa_7.xyz, ssa_8.xxx

This helps r300 driver because we can only do single channel for math
ops at a time, so the first version would result in three frcp
instructions. The nir_opt_shrink_vectors comments even claim the pass
should be doing this, however it actually does it only for nir_op_vecx
instructions, so extend this for generic alu instructions.

RV530 shader-db:
total instructions in shared programs: 135032 -> 133707 (-0.98%)
instructions in affected programs: 46121 -> 44796 (-2.87%)
helped: 452
HURT: 26
total temps in shared programs: 17051 -> 17033 (-0.11%)
temps in affected programs: 1509 -> 1491 (-1.19%)
helped: 91
HURT: 30

12.02->12.08 (+0.5%) fps gain in Unigine Sanctuary (n=5) with RV530

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7051
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reiewed-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20213>
2023-01-03 12:32:33 +01:00
Christian Gmeiner
9e56f69edf isaspec: encode: handle special fieldname properties
Without this change a fieldname like '{DST::align=12}' was not used for
encoding. Change the regex to include such fieldnames and remove the fieldname
property in a later step.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20462>
2022-12-31 13:43:15 +00:00
Danylo Piliaiev
1c9ee30838 nir/fold_16bit_tex_image: Add type granularity for dst folding
Some HW may be able to fold only some of dst types, e.g.
for Adreno folding i32 -> i16 could cause a different result since
folded variant clamps the result instead of masking it.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20396>
2022-12-23 15:48:18 +01:00
Lionel Landwerlin
3af08b9c30 nir/divergence: handle shader_record_ptr intrinsic
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes 6b8fd65e84 ("spirv: Implement the new ray-tracing storage classes")

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20413>
2022-12-23 09:22:13 +00:00