This value is intended to be used to remove out of bounds array
access when unrolling loops so it should contain the comparison
that contains the the induction variable not the overall
condition of the loop terminator. So here we update the instruction
when dealing with iand/ior loop terminator conditions.
Acked-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28998>
nir_lower_frag_coord_to_pixel_coord was adding .5 to work around that the
drivers were mistakenly setting PIXEL_COORD_HALF_INTEGER. With the
setting corrected, the GL frontend handles it appropriately (instead of
subtracting half in the frontend for ARB_fragment_coord_conventions
integer setting and then adding the half back here), and makes the pass
reusable from Intel.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29585>
This tests usually takes around 15 seconds on CI, according to logs. It
recently timed out under load, causing a job to fail spuriously.
Let's bump the timeout here to 60. That's in line with the glsl compiler
warnings test, which usually takes around the same time.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29547>
Looking at each usage of DETECT_OS_UNIX, it's more about the POSIX API usage, not the
Unix-like OS, so let's rename it
And for POSIX it's a standard to claim which API present, but for UNIX there is no such thing
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29555>
v2: Add some comments explaining some of the nuance of the shift
optimizations. Fix a bug in the shift count calculation of the upper
32-bits. Move the @64 from the variable to the opcode. All suggested
by Jordan.
No shader-db changes on any Intel platform.
fossil-db:
Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Instrs: 154507026 -> 154506576 (-0.00%)
Cycle count: 17436298868 -> 17436295016 (-0.00%)
Max live registers: 32635309 -> 32635297 (-0.00%)
Totals from 42 (0.01% of 632575) affected shaders:
Instrs: 5616 -> 5166 (-8.01%)
Cycle count: 133680 -> 129828 (-2.88%)
Max live registers: 1158 -> 1146 (-1.04%)
No fossil-db changes on any other Intel platform.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29148>
I noticed some shaders with patterns similar to these while working on
cooperative matrix lowering.
Meteor Lake and DG2 are the only platforms that support iadd3, so there
were no shader-db or fossil-db changes on any other platforms.
shader-db:
Meteor Lake and DG2 had similar results. (Meteor Lake shown)
total instructions in shared programs: 19869445 -> 19868343 (<.01%)
instructions in affected programs: 419426 -> 418324 (-0.26%)
helped: 913 / HURT: 2
total cycles in shared programs: 936010029 -> 935909811 (-0.01%)
cycles in affected programs: 31746523 -> 31646305 (-0.32%)
helped: 495 / HURT: 356
LOST: 10
GAINED: 12
fossil-db:
Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Instrs: 154514596 -> 154505466 (-0.01%); split: -0.01%, +0.00%
Cycle count: 17540226067 -> 17436266198 (-0.59%); split: -0.63%, +0.04%
Spill count: 146887 -> 146886 (-0.00%)
Fill count: 272499 -> 272489 (-0.00%); split: -0.01%, +0.00%
Max live registers: 32634290 -> 32634739 (+0.00%); split: -0.00%, +0.00%
Max dispatch width: 5550128 -> 5550368 (+0.00%)
Totals from 4401 (0.70% of 632560) affected shaders:
Instrs: 3095239 -> 3086109 (-0.29%); split: -0.30%, +0.00%
Cycle count: 7327352564 -> 7223392695 (-1.42%); split: -1.51%, +0.10%
Spill count: 28105 -> 28104 (-0.00%)
Fill count: 45830 -> 45820 (-0.02%); split: -0.04%, +0.02%
Max live registers: 264376 -> 264825 (+0.17%); split: -0.05%, +0.22%
Max dispatch width: 43768 -> 44008 (+0.55%)
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29148>
This allows us to not generate 64-bit iadd3 on Intel but continue
generating it for NVIDIA.
No shader-db or fossil-db changes.
v2: Add nir_lower_iadd3_64 flag so we can continue to generate 64-bit
iadd3 on NVIDIA platforms.
v3: s/bit_size == 64/s == 64/. This cut-and-paste bug prevented any of
the optimizations from ever occuring.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29148>
It doesn't make sense to have two sets of opcodes for this when all backends
that support the flush_to_zero variant just rely on the global floating point
mode anyway.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29433>
For drivers that don't lower mediump shader inputs / outputs
to 16-bit, it's better to ignore the mediump flag completely,
letting mediump inputs / outputs work like normal 32-bit IO.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29435>
qsort is not guaranteed to produce a stable sort, and indeed
in MSVC CRT does not. The xfb varying sort functions were
relying on undefined behavior (that qsort would be stable).
Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29178>
In the sort functions used to sort varyings in gl_nir_link_varyings,
we were only checking the first input for whether or not it is xfb.
Check both inputs, and also provide a definite order for the xfb vs.
non-xfb varyings (the xfb come last, as the initial sort established).
This fixes a problem encountered on panfrost, where qsort could
mix xfb and non-xfb varyings which started out separate.
Note that the sort is still not stable. We probably should make it
stable, but that is a more extensive change that's handled in a later
commit.
Cc: mesa-stable
Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29178>
Dumps the value associated with each SPIR-V ID after parsing the module.
This will show the intermediate vtn_* values that spirv_to_nir uses.
Only a subset of detailed information is printed at the moment (focus on
pointers and pointer types), but it is easy to add for other value types
later.
Example output when running crucible with the debug option enabled.
```
crucible: start : func.compute.num-workgroups.basic.q0
=== SPIR-V values
1 = extension
2 = type void glsl_type=void
3 = type function
4 = function
5 = block
6 = type scalar glsl_type=uint
7 = type vector glsl_type=uvec3
8 = type array glsl_type=uvec3[]
9 = type struct glsl_type=Storage
10 = type pointer deref=9 SpvStorageClassUniform glsl_type=uvec4
11 = pointer ptr_type=10 (pointed-)type=9
12 = type scalar glsl_type=int
13 = constant type=12
14 = type pointer deref=7 SpvStorageClassInput glsl_type=uint
15 = pointer ptr_type=14 (pointed-)type=7
16 = constant type=6
17 = type pointer deref=6 SpvStorageClassInput glsl_type=uint
18 = pointer ptr_type=17 (pointed-)type=6
NIR: 32 %2 = deref_array &(*%0)[0] (system uint) // &gl_LocalInvocationID[0]
19 = ssa glsl_type=uint
20 = pointer ptr_type=14 (pointed-)type=7
21 = ssa glsl_type=uvec3
22 = type pointer deref=7 SpvStorageClassUniform glsl_type=uint
23 = pointer ptr_type=22 (pointed-)type=7
NIR: 32x4 %12 = deref_array &(*%11)[%4] (ssbo uvec3) // &((Storage *)%9)->uv3a[%4]
24 = constant type=6
25 = constant type=6
26 = constant type=7
===
crucible: pass : func.compute.num-workgroups.basic.q0
```
When the environment variable is set, this dump will also be printed
during vtn_fail and its helpers.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29295>
Promoting flat inputs should only happen while assigning FS input
slot groups. Otherwise we risk adding extra input slots, which
is undesireable.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29208>
New pass that shares code with nir_opt_load_store_vectorize but
it only updates the alignment of load/store instructions.
It is useful before running other passes which may
potentially destroy that information (eg. by removing some
instructions from which the alignment may be deduced).
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29210>
We'll use the delta for an upcoming internal printf mechanism, where
the PARAM_IDX will be the base printf reloc identifier and the BASE
will be the string id.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25814>