Commit graph

150141 commits

Author SHA1 Message Date
Ilia Mirkin
40468430a4 isaspec: fix gen_max to be 2^32-1
The minus sign has higher preference than shift:

>>> 1 << 32 - 1
2147483648
>>> (1 << 32) - 1
4294967295

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14789>
2022-02-12 13:45:57 -05:00
Alyssa Rosenzweig
9dc30f99ae panfrost: Flesh out the Shader Program Descriptor
Only breaking change since Bifrost is that the shader contains barrier? flag is
now fragment-only, meaning it is just a spawn helper threads flag. This affects
compute shaders slightly.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15003>
2022-02-12 09:32:55 -05:00
Alyssa Rosenzweig
60b37424d9 panfrost: Simplify Valhall preload descriptor
Honestly, we could stand to do the same to Bifrost...

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15003>
2022-02-12 09:32:55 -05:00
Alyssa Rosenzweig
1e9a35648a panfrost: Clarify unknowns in z/stencil descriptor
Depth culling and clamping.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15003>
2022-02-12 09:32:55 -05:00
Alyssa Rosenzweig
733d5f061d panfrost: Add more fields to Attribute Descriptor
More XML

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15003>
2022-02-12 09:32:35 -05:00
Alyssa Rosenzweig
b31f6a821d panfrost: Update primitive descriptor for Valhall
Contains stuff needed for layered rendering. Unfortunately, there's no more
provoking vertex per draw -- ugh! That's fine for Vulkan (just don't set
provokingVertexModePerPipeline), but requires inserting extra flushes on desktop
OpenGL.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15003>
2022-02-12 09:32:18 -05:00
Bas Nieuwenhuizen
d8d32cf773 radv: Only wait on CS/PS to finish if we wait on a semaphore.
I think plain submission doesn't need it.

Reviewed-By: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14574>
2022-02-11 23:45:24 +00:00
Bas Nieuwenhuizen
79131b6ee6 radv: Fix preamble argument order.
Used the wrong cmdbuffer in the wrong situation. Oops.

Fixes: 915e9178fa ("radv: Split out commandbuffer submission.")
Reviewed-By: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14574>
2022-02-11 23:45:24 +00:00
Bas Nieuwenhuizen
7adb3c0f7f radv: Use larger arena sizes.
For some games that take like 400 MiB of shader binaries, the
number of shader arenas ends up going >1500. Cut that down a bit
by using larger arenas.

8 MiB should still be decent with small BAR and should still cut
things down from ~1500 to ~50 buffers.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14591>
2022-02-11 23:20:21 +00:00
Erico Nunes
0f9756f480 lima/ppir: refactor bitcopy to use unsigned char
This code does not work as expected when built with clang and
-fstrict-aliasing.
Redefine it in unsigned char operations so that it does not
violate strict aliasing rules.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Cc: 22.0 <mesa-stable>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14894>
2022-02-11 21:55:10 +00:00
Erico Nunes
7297f931f0 lima/ppir: initialize slots array for dummy/undef
Some functions in ppir iterate the ppir_op_info slots arrays looking
for the PPIR_INSTR_SLOT_END token. The dummy/undef internal ops may
appear in the scheduling code and their slots arrays did not contain
that token, which could result in invalid array reads.
Reported by gcc -fsanitize=address.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Cc: 22.0 <mesa-stable>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14894>
2022-02-11 21:55:10 +00:00
Erico Nunes
5b15849366 lima/gpir: avoid invalid write in regalloc
Reported by gcc -fsanitize=address, sometimes gpir regalloc attempts to
handle an uninitialized node->value_reg (containing the value -1), which
results in an invalid array access.
Avoid it for now to prevent crashes, but more investigation may be
required later on.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Cc: 22.0 <mesa-stable>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14894>
2022-02-11 21:55:10 +00:00
Erico Nunes
58c619c3d6 lima: remove an unneeded lima_job_get assignment
Reported by scan-build.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14894>
2022-02-11 21:55:10 +00:00
Erico Nunes
116f01c853 lima: add some checks for potential null pointer dereference
scan-build complains about a potential null pointer dereference in
some places around the lima code.
None of those seem to be a real issue as of now, but let's add some
asserts to cover for that and clean up the warning list.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14894>
2022-02-11 21:55:10 +00:00
Erico Nunes
60888c95bd lima: fix warning of garbage value access
scan-build complains that an access of reg[j+1] in this code might
return garbage.
Let's take the chance to clean this open coded sorting code up and
just use qsort.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14894>
2022-02-11 21:55:10 +00:00
Erico Nunes
2d807979c9 lima/ppir: initialize spill_costs array in regalloc
Static analysis complains that spill_costs might be accessed in
non-initialized positions.
It does not seem to be an issue with the current code which initializes
it for every relevant register index, but we can also just initialize it
to not have to worry about that.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14894>
2022-02-11 21:55:10 +00:00
Erico Nunes
8a8f34c320 lima/ppir: avoid ppir_codegen_outmod implicit conversion
Fix some clang -Wenum-conversion warnings like:

  warning: implicit conversion from enumeration type 'ppir_outmod' to
  different enumeration type 'ppir_codegen_outmod' [-Wenum-conversion]

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14894>
2022-02-11 21:55:10 +00:00
Erico Nunes
3d77950b8b lima/ppir: clean up override-init warnings
Define ppir_op_unsupported as 0 so that we don't have to do the
initialization to -1.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14894>
2022-02-11 21:55:10 +00:00
Erico Nunes
823be63216 lima/gpir: clean up override-init warnings
Define gpir_op_unsupported as 0 so that we don't have to do the
initialization to -1.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14894>
2022-02-11 21:55:10 +00:00
Chia-I Wu
8a30b1541c venus: use 64KB alignment for suballocations
TGL CCS surface addresses must be aligned to 64KB.

Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15001>
2022-02-11 21:26:45 +00:00
Yiwei Zhang
05a2cea14c venus: no roundtrip needed for shmem backed by BLOB_MEM_HOST3D
A successful DRM_IOCTL_VIRTGPU_MAP on BLOB_MEM_HOST3D implies a
roundtrip.

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14658>
2022-02-11 21:16:42 +00:00
Yiwei Zhang
a76d1e0e74 venus: init renderer_info at renderer creation (part 2)
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14658>
2022-02-11 21:16:42 +00:00
Yiwei Zhang
ddba7337c7 venus: init renderer_info at renderer creation (part 1)
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14658>
2022-02-11 21:16:42 +00:00
Daniel Schürmann
1bbbabedb7 aco/insert_exec_mask: refactor and remove some unnecessary WQM handling code
Some cases cannot happen and don't need to be handled anymore.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Daniel Schürmann
d7d7b9974a aco/insert_exec_mask: refactor and simplify get_block_needs()
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Daniel Schürmann
fcc5dec8d6 aco/insert_exec_mask: remove ever_again_needs and Exact_Branch
This information is not required anymore.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Daniel Schürmann
cbb1b095ca aco/insert_exec_mask: remove some unnecessary WQM loop handling code
These workarounds are were necessary to prevent infinite loops
with helper lane registers containing wrong data.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Daniel Schürmann
580a63b4ac aco/insert_exec_mask: remove Preserve_WQM flag
If WQM is needed anywhere after discard_if(), it will also
be flagged as WQM. We can rely on that to preserve the WQM mask.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Daniel Schürmann
a5a40beefa aco: don't emit WQM for bool_to_scalar_condition
This was only necessary to ensure that the source is computed in WQM
if the current exec mask is in WQM.

Totals from 23170 (17.17% of 134913) affected shaders: (GFX10.3)
VGPRs: 1384464 -> 1383400 (-0.08%); split: -0.08%, +0.01%
SpillSGPRs: 7575 -> 7574 (-0.01%)
CodeSize: 146444752 -> 146317104 (-0.09%); split: -0.13%, +0.04%
MaxWaves: 429870 -> 429868 (-0.00%)
Instrs: 27202586 -> 27170316 (-0.12%); split: -0.17%, +0.05%
Latency: 379488313 -> 379335412 (-0.04%); split: -0.07%, +0.03%
InvThroughput: 69500561 -> 69487704 (-0.02%); split: -0.03%, +0.01%
VClause: 473080 -> 473038 (-0.01%); split: -0.02%, +0.01%
SClause: 1080576 -> 1080571 (-0.00%); split: -0.06%, +0.06%
Copies: 1492865 -> 1504604 (+0.79%); split: -0.40%, +1.19%
Branches: 711295 -> 716849 (+0.78%); split: -0.01%, +0.79%
PreSGPRs: 1331362 -> 1328402 (-0.22%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Daniel Schürmann
f816dd1be7 aco: don't propagate WQM for p_as_uniform
This was needed, so that in case of active helper lanes,
these contain the correct value. It is now handled implicitly.

Totals from 1004 (0.74% of 134913) affected shaders: (GFX10.3)
CodeSize: 7581020 -> 7580892 (-0.00%); split: -0.00%, +0.00%
Instrs: 1454940 -> 1454908 (-0.00%); split: -0.00%, +0.00%
Latency: 12984953 -> 12984894 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 3173037 -> 3173049 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 47498 -> 47273 (-0.47%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Daniel Schürmann
825cd696dc aco/insert_exec_mask: stay in WQM while helper lanes are still needed
This patch flags all instructions WQM which don't require
Exact mode, but depend on the exec mask as long as WQM
is needed on any control flow path afterwards.
This will mostly prevent accidental copies of WQM values
within Exact mode, and also makes a lot of other workarounds
unnecessary.

Totals from 17374 (12.88% of 134913) affected shaders: (GFX10.3)
VGPRs: 526952 -> 527384 (+0.08%); split: -0.01%, +0.09%
CodeSize: 33740512 -> 33766636 (+0.08%); split: -0.06%, +0.14%
MaxWaves: 488166 -> 488108 (-0.01%); split: +0.00%, -0.02%
Instrs: 6254240 -> 6260557 (+0.10%); split: -0.08%, +0.18%
Latency: 66497580 -> 66463472 (-0.05%); split: -0.15%, +0.10%
InvThroughput: 13265741 -> 13264036 (-0.01%); split: -0.03%, +0.01%
VClause: 122962 -> 122975 (+0.01%); split: -0.01%, +0.02%
SClause: 334805 -> 334405 (-0.12%); split: -0.51%, +0.39%
Copies: 275728 -> 282341 (+2.40%); split: -0.91%, +3.31%
Branches: 92546 -> 90990 (-1.68%); split: -1.68%, +0.00%
PreSGPRs: 504119 -> 504352 (+0.05%); split: -0.00%, +0.05%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Ian Romanick
59889eb3ae Re-indentation after the previous commit
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14573>
2022-02-11 17:25:34 +00:00
Ian Romanick
912299cb39 glsl: Eliminate ir_assignment::condition
Reformatting is left for the next commit.

v2: Remove assignments from the contructors. :face_palm:

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14573>
2022-02-11 17:25:33 +00:00
Ian Romanick
fb630cd783 glsl: Make ir_assignment::condition private
And add get_condition().

This proof that nothing remains that could possibly set ::condition to
anything other than NULL.

v2: Fix bad rebase.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14573>
2022-02-11 17:25:33 +00:00
Ian Romanick
5208c116f2 glsl: Don't visit rvalues in the condition of an assignment
At this point, this should always be NULL.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14573>
2022-02-11 17:25:33 +00:00
Ian Romanick
1c22f06970 glsl: Don't lower vector indexing in the condition of an assignment
At this point, this should always be NULL.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14573>
2022-02-11 17:25:33 +00:00
Ian Romanick
2652b9a83d glsl: Don't split structures in the condition of an assignment
At this point, this should always be NULL.

v2: Fix bad rebase.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14573>
2022-02-11 17:25:33 +00:00
Ian Romanick
d07551ad4e glsl: Don't split arrays in the condition of an assignment
At this point, this should always be NULL.

v2: Fix bad rebase.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14573>
2022-02-11 17:25:33 +00:00
Ian Romanick
41f6b42b08 glsl: Don't tree graft in the condition of an assignment
At this point, this should always be NULL.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14573>
2022-02-11 17:25:33 +00:00
Ian Romanick
97ffca80a8 glsl: Don't dead-built-in varying eliminate in the condition of an assignment
At this point, this should always be NULL.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14573>
2022-02-11 17:25:33 +00:00
Ian Romanick
231459ad26 glsl: Remove unused condition parameter from ir_assignment constructor
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14573>
2022-02-11 17:25:33 +00:00
Ian Romanick
ec5c9649ba glsl: Don't constant-fold the condition of an assignment
At this point, this should always be NULL.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14573>
2022-02-11 17:25:33 +00:00
Ian Romanick
697a460e49 glsl: Don't clone assignment conditions
At this point, this should always be NULL.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14573>
2022-02-11 17:25:33 +00:00
Ian Romanick
d27e3c6adc glsl: Eliminate unused conditional assignment constructor
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14573>
2022-02-11 17:25:33 +00:00
Ian Romanick
50a0d0d875 glsl: Remove the ability to read text IR with conditional assignments
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14573>
2022-02-11 17:25:33 +00:00
Ian Romanick
710adf2e60 glsl: Add ir_assignment constructor that takes just a write mask
The other constructor that takes a write mask and a condition will be
removed shortly.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14573>
2022-02-11 17:25:33 +00:00
Ian Romanick
afee5dc63f glsl: Lower if to conditional select instead of conditional assignment
Platforms that don't have flow control also don't have anything that
could be written that has a side effect.  It should be safe to implement
these condition writes as

    foo = csel(condition, bar, foo);

This should eliminate the last thing in the GLSL compiler that can
create new conditions on assignments.  Everything else that can store
something in ir_assignment::condition derives it from a pre-existing
condition.

v2: Fix bad rebase.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14573>
2022-02-11 17:25:33 +00:00
Ian Romanick
c3626e1580 glsl/ir_builder: Eliminate unused conditional assignment builders
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14573>
2022-02-11 17:25:33 +00:00
Ian Romanick
46a9df6aac glsl: Don't try to emit the "linear sequence" in lower_variable_index_to_cond_assign
When there are four or fewer elements left in the array partition, the
strategy changes from a binary search of nested flow control to sequence
of conditional assignments like

    (assign, dest, src[constant_i+0], index == constant_i+0)
    (assign, dest, src[constant_i+1], index == constant_i+1)
    (assign, dest, src[constant_i+2], index == constant_i+2)
    (assign, dest, src[constant_i+3], index == constant_i+3)

or

    (assign, dest[constant_i+0], src, index == constant_i+0)
    (assign, dest[constant_i+1], src, index == constant_i+1)
    (assign, dest[constant_i+2], src, index == constant_i+2)
    (assign, dest[constant_i+3], src, index == constant_i+3)

Realistically, the first case should use ir_triop_csel instead.

The second case will either get turned back into flow control like

    if (index == constant_i+0)
        (assign, dest[constant_i+0], src)

    if (index == constant_i+1)
        (assign, dest[constant_i+1], src)

    if (index == constant_i+2)
        (assign, dest[constant_i+2], src)

    if (index == constant_i+3)
        (assign, dest[constant_i+3], src)

or a sequence of conditional selects like

    (assign, dest[constant_i+0], (csel, index == constant_i+0, src, dest[constant_i+0]))
    (assign, dest[constant_i+1], (csel, index == constant_i+1, src, dest[constant_i+1]))
    (assign, dest[constant_i+2], (csel, index == constant_i+2, src, dest[constant_i+2]))
    (assign, dest[constant_i+3], (csel, index == constant_i+3, src, dest[constant_i+3]))

The former case should continue to use the binary search.  The later
case could be generated from the binary search by other lowering passes.

At the end of the day, conditional assignments don't really help
anything here, so stop using them.

Radeon R430
total instructions in shared programs: 2398683 -> 2398419 (-0.01%)
instructions in affected programs: 5143 -> 4879 (-5.13%)
helped: 9
HURT: 8

total vinst in shared programs: 616292 -> 616010 (-0.05%)
vinst in affected programs: 4467 -> 4185 (-6.31%)
helped: 9
HURT: 8

total sinst in shared programs: 315417 -> 315667 (0.08%)
sinst in affected programs: 2568 -> 2818 (9.74%)
helped: 2
HURT: 15

total flowcontrol in shared programs: 1049 -> 1048 (-0.10%)
flowcontrol in affected programs: 7 -> 6 (-14.29%)
helped: 1
HURT: 0

total presub in shared programs: 47027 -> 47027 (0.00%)
presub in affected programs: 127 -> 127 (0.00%)
helped: 1
HURT: 1

total omod in shared programs: 3618 -> 3615 (-0.08%)
omod in affected programs: 8 -> 5 (-37.50%)
helped: 3
HURT: 0

total temps in shared programs: 450757 -> 451312 (0.12%)
temps in affected programs: 837 -> 1392 (66.31%)
helped: 8
HURT: 6

total consts in shared programs: 1031928 -> 1031920 (<.01%)
consts in affected programs: 1211 -> 1203 (-0.66%)
helped: 6
HURT: 7

The shaders that were hurt for temps... are all lies.  None of those
shaders should have compiled as all 6 had more than 32 temps to begin
with.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14573>
2022-02-11 17:25:33 +00:00
Ian Romanick
05703a49f9 glsl: Use csel in do_vec_index_to_cond_assign
This matches what NIR does.  See nir_vector_extract.

This improves code generation for several reasons.  First, it only
requires 3 comparisons instead of 4 (vec3(i > 0, i > 1, i > 2) vs
vec4(i == 0, i == 1, i == 2, i == 3)).

Secondly, it shortens the liverange of some values, possibly quite
dramatically.  Consider a loop in the old version (after lowering
if-statements to selects):

    loop {
        ...
	x = csel(i == 0, a[0], x);
	x = csel(i == 1, a[1], x);
	x = csel(i == 2, a[2], x);
	x = csel(i == 3, a[3], x);
        ...
    }

x is live for the whole loop across iterations.  In the new version, x
is only live while its value is needed:

    loop {
        ...
	t0 = csel(i > 0 , a[1], a[0]);
	t1 = csel(i > 2 , a[3], a[2]);
	x = csel(i > 1, t1, t0);
        ...
    }

Outside a loop, this also means more values of the array may have their
liveness reduced sooner (by consuming two values at once).

All Intel platforms had similar results. (Tigerlake shown)
total instructions in shared programs: 21171336 -> 21163615 (-0.04%)
instructions in affected programs: 89680 -> 81959 (-8.61%)
helped: 40
HURT: 4
helped stats (abs) min: 1 max: 450 x̄: 193.68 x̃: 196
helped stats (rel) min: 0.41% max: 13.32% x̄: 6.01% x̃: 6.22%
HURT stats (abs)   min: 1 max: 12 x̄: 6.50 x̃: 6
HURT stats (rel)   min: 0.50% max: 0.66% x̄: 0.58% x̃: 0.58%
95% mean confidence interval for instructions value: -229.68 -121.28
95% mean confidence interval for instructions %-change: -6.93% -3.89%
Instructions are helped.

total cycles in shared programs: 832879641 -> 829513122 (-0.40%)
cycles in affected programs: 44738430 -> 41371911 (-7.52%)
helped: 35
HURT: 2
helped stats (abs) min: 2 max: 189948 x̄: 96186.49 x̃: 116154
helped stats (rel) min: 0.37% max: 11.08% x̄: 5.88% x̃: 6.47%
HURT stats (abs)   min: 4 max: 4 x̄: 4.00 x̃: 4
HURT stats (rel)   min: 0.69% max: 0.69% x̄: 0.69% x̃: 0.69%
95% mean confidence interval for cycles value: -112881.94 -69092.06
95% mean confidence interval for cycles %-change: -6.77% -4.27%
Cycles are helped.

total spills in shared programs: 8061 -> 7338 (-8.97%)
spills in affected programs: 873 -> 150 (-82.82%)
helped: 24
HURT: 0

total fills in shared programs: 7501 -> 6388 (-14.84%)
fills in affected programs: 1389 -> 276 (-80.13%)
helped: 24
HURT: 0

Radeon R430
total instructions in shared programs: 2449852 -> 2449136 (-0.03%)
instructions in affected programs: 6285 -> 5569 (-11.39%)
helped: 64
HURT: 0
helped stats (abs) min: 4 max: 12 x̄: 11.19 x̃: 12
helped stats (rel) min: 8.16% max: 21.62% x̄: 12.09% x̃: 10.91%

total consts in shared programs: 1032517 -> 1032482 (<.01%)
consts in affected programs: 966 -> 931 (-3.62%)
helped: 35
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 2.94% max: 10.00% x̄: 4.26% x̃: 3.57%

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14573>
2022-02-11 17:25:33 +00:00