Commit graph

10167 commits

Author SHA1 Message Date
Daniel Schürmann
8ace685432 radv/rt: inline radv_rt_pipeline_create_() helper into radv_rt_pipeline_create()
This saves some re-allocation and will help in future with RT shader functions.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18755>
2022-09-26 13:03:44 +00:00
Daniel Schürmann
40366a3aaf radv/rt: create separate radv_rt_pipeline struct
inherited from radv_compute_pipeline to contain all RT-related information.
This will make it easier to transition to RT shader functions.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18755>
2022-09-26 13:03:44 +00:00
Mike Blumenkrantz
007c6b1dd2 radv: use direct access to last_vgt_api_stage_locs for sgpr emission
radv_lookup_user_sgpr is heavy, stop using it

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18808>
2022-09-26 11:40:26 +00:00
Mike Blumenkrantz
351c4ed6d1 radv: store pointer to sgprs for last vertex stage
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18808>
2022-09-26 11:40:26 +00:00
Marcin Ślusarz
0337f449a6 radv: use nir_shader_instructions_pass in radv_nir_lower_ycbcr_textures
Changes:
- nir_metadata_preserve(..., nir_metadata_all) is called when pass doesn't
  make progress

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12282>
2022-09-26 11:13:03 +00:00
Mike Blumenkrantz
a54d996463 radv: ALWAYS_INLINE radv_is_streamout_enabled()
v2 by Timur Kristóf:
- Use ALWAYS_INLINE instead of just inline.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18807>
2022-09-26 10:51:25 +00:00
Mike Blumenkrantz
051594fb7f radv: ALWAYS_INLINE radv_flush_descriptors
B I G  P E R F

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18807>
2022-09-26 10:51:25 +00:00
Mike Blumenkrantz
876f7f60ac radv: ALWAYS_INLINE radv_after_draw
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18807>
2022-09-26 10:51:25 +00:00
Samuel Pitoiset
7985a9df9a radv: enable NGG culling unconditionally for GPL but disable it dynamically
When compiling the pre-rasterization stages we don't know the primitive
topology, but we still want to enable NGG culling for performance. To
achieve that, NGG culling is enabled unconditionally when the topology
is unknown and disabled dynamically for points or lines.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18776>
2022-09-26 09:28:14 +02:00
Samuel Pitoiset
e7f6786d59 radv: use the maximum number of vertices per primitives for NGG with GPL
The hw will ignore the extra bits when points/lines are drawn, so this
just works as-is.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18776>
2022-09-26 09:28:14 +02:00
Samuel Pitoiset
e364670e83 radv: determine the last VGT api stage also for GPL
When compiling the pre-rasterization stages, we need to know this.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18776>
2022-09-26 09:28:14 +02:00
Samuel Pitoiset
e053f6feb3 radv: remove useless gfx10_ngg_info::enable_vertex_grouping
It's always TRUE and this will simplify future changes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18776>
2022-09-26 09:28:14 +02:00
Samuel Pitoiset
efd83e9627 radv: allow to build the pre-rasterization stages in a library
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18672>
2022-09-26 07:25:50 +00:00
Samuel Pitoiset
456543e6d8 radv: determine the last VGT api stage from the active_stages bitfield
With GPL, we can get binaries from libs directly, so this would fail.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18672>
2022-09-26 07:25:50 +00:00
Samuel Pitoiset
c08ba6a76c radv: import the GS copy shader from a library if present
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18672>
2022-09-26 07:25:50 +00:00
Samuel Pitoiset
2fd3b0bceb radv: do not free the GS copy shader binary if created from a library
Similar to other shader stages.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18672>
2022-09-26 07:25:50 +00:00
Konstantin Seurer
bf4a2b374f radv: Use scalar layout for BDA references
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18794>
2022-09-25 19:44:45 +00:00
Timur Kristóf
25e1c3d5b3 radv: Use a fallback for marketing name when libdrm doesn't know it.
Currently for GPUs which don't have a marketing name in libdrm,
RADV just prints "(null) (RADV ...)", which looks bad.

This commit replaces the "(null)" with "AMD Unknown".

Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18775>
2022-09-25 16:32:58 +00:00
Konstantin Seurer
0d66ed49b4 radv/rtpso: Use the common traversal helper
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18650>
2022-09-24 13:23:41 +00:00
Konstantin Seurer
3f72061be9 radv/rq: Use the common traversal helper
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18650>
2022-09-24 13:23:40 +00:00
Konstantin Seurer
ac45935345 radv: Add a common traversal build helper
Adds a helper for building the ray traversal loop to radv_rt_common.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18650>
2022-09-24 13:23:40 +00:00
Vinson Lee
e24a816818 radv: Fix file descriptor leak.
Fix defect reported by Coverity Scan.

Resource leak (RESOURCE_LEAK)
leaked_storage: Variable file going out of scope leaks the storage it points to.

Fixes: 5749806754 ("radv: Add Radeon Raytracing Analyzer trace dumping utilities")
Suggested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18660>
2022-09-23 22:55:14 -07:00
Bas Nieuwenhuizen
5a3411567a radv: Properly initialize all memory in RRA dumps.
Helps debugging when making RADV BVH format changes.

changes:

1. Change bool to uint32_t because padding was uninitialized
2. Named bitfields, because unnamed bitfields were never initialized.
3. Use calloc where possible.

Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18734>
2022-09-24 02:01:35 +02:00
Bas Nieuwenhuizen
19aae06692 radv: Use deterministic order for dumping acceleration stuctures.
To provide deterministic RRA dumps.

Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18734>
2022-09-24 02:00:42 +02:00
Bas Nieuwenhuizen
50bb0d6427 radv: Use GLSL matrices for instance transforms in BVH.
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18692>
2022-09-23 22:52:23 +00:00
Bas Nieuwenhuizen
3c09681edd radv: Use proper matrices for instance nodes.
Converts both wto and otw matrices to be full row-major 4x3 matrices.

Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18692>
2022-09-23 22:52:23 +00:00
Bas Nieuwenhuizen
0f9fb8e15f radv: Remove aabb bounds from instance nodes.
I need space ...

Furthermore, this only gets used during the build, and we can eat
the cost of generating the AABB a second time there.

Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18692>
2022-09-23 22:52:23 +00:00
Bas Nieuwenhuizen
b1ddb35040 radv: Translate the BVH copy shader to glsl from nir_builder.
Much easier to change.

Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18692>
2022-09-23 22:52:23 +00:00
Bas Nieuwenhuizen
ffc5f52724 radv: Hardcode root node id.
Optimizes code a tiny bit, and avoid the hack of encoding the root
node id in the low bits of the BLAS address in an instance node.

This is needed to adjust serialization/deserialization as the
instance address there has to be the base address, so this avoids
some wrapping/unwrapping.

Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18692>
2022-09-23 22:52:23 +00:00
Georg Lehmann
513d074d39 radv: Fix GLSL BDA struct alignment and use pointer arithmetic SIZEOF.
Use pointer arithmetic from GL_EXT_buffer_reference2 to replace explicit
struct sizes. For that to work we also need to fix BDA alignment by setting
it to the smallest possible value (default is 16).

We could also replace the INDEX macro with a[b] operators, but that's actually
a change in behavior because a[b] always uses 64bit arithmetic.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18765>
2022-09-23 21:07:00 +00:00
Samuel Pitoiset
eded5bda4e radv/ci: add piglit testing with Zink on NAVI10
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18315>
2022-09-23 11:15:16 +02:00
Samuel Pitoiset
31dc15b977 radv: rework and rename radv_make_buffer_descriptor()
Pass the VA directly and rename to radv_make_texel_buffer_descriptor().
This helper will be re-used for future work.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18750>
2022-09-23 06:55:21 +00:00
Samuel Pitoiset
81d7a6bdff radv: remove unnecessary radv_buffer_view::vk_format
It's not really useful.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18750>
2022-09-23 06:55:21 +00:00
Samuel Pitoiset
a5ca3c1638 radv: pass a VkSampler to write_sampler_descriptor()
For future work, and remove useless radv_device parameter.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18750>
2022-09-23 06:55:21 +00:00
Yonggang Luo
a144f3f80c radv: Getting radeon_icd to be generated properly on win32
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18747>
2022-09-22 17:54:24 +00:00
Daniel Schürmann
cd36a29759 aco/optimizer: change inverse_comparison in-place
This avoids creating a second comparison which is more expensive
than just using the existing s_not.

Totals from 1089 (0.81% of 134913) affected shaders: (GFX10.3)
VGPRs: 50472 -> 50184 (-0.57%)
CodeSize: 4724692 -> 4760824 (+0.76%); split: -0.03%, +0.79%
MaxWaves: 23964 -> 24012 (+0.20%)
Instrs: 859588 -> 859687 (+0.01%); split: -0.11%, +0.12%
Latency: 10674653 -> 10650353 (-0.23%); split: -0.41%, +0.18%
InvThroughput: 1752987 -> 1750238 (-0.16%); split: -0.20%, +0.04%
VClause: 20921 -> 20872 (-0.23%); split: -0.68%, +0.45%
SClause: 31417 -> 31550 (+0.42%)
Copies: 69428 -> 68738 (-0.99%); split: -1.52%, +0.53%
PreSGPRs: 48033 -> 49649 (+3.36%)
PreVGPRs: 44490 -> 43699 (-1.78%)

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18253>
2022-09-22 12:35:11 +00:00
Timur Kristóf
c8445c1691 aco: Change inverse-comparison optimization to work with s_not
Some time ago we stopped using s_andn2 with exec for boolean NOT.
The reasoning behind that change was that those booleans will be
always ANDed with exec when necessary.

This inhibited the inverse-comparison optimization in most cases
which is fixed by this patch.

Fossil DB stats on Navi 21:

Totals from 12251 (9.08% of 134913) affected shaders:
VGPRs: 801744 -> 802016 (+0.03%); split: -0.00%, +0.04%
SpillSGPRs: 8863 -> 8893 (+0.34%)
CodeSize: 100593244 -> 100370684 (-0.22%); split: -0.22%, +0.00%
MaxWaves: 204994 -> 204948 (-0.02%); split: +0.00%, -0.02%
Instrs: 18717001 -> 18668965 (-0.26%); split: -0.26%, +0.00%
Latency: 263255046 -> 262874896 (-0.14%); split: -0.16%, +0.02%
InvThroughput: 52760249 -> 52721736 (-0.07%); split: -0.08%, +0.01%
VClause: 329631 -> 329680 (+0.01%); split: -0.03%, +0.04%
SClause: 681563 -> 681435 (-0.02%); split: -0.02%, +0.00%
Copies: 1331612 -> 1372446 (+3.07%); split: -0.03%, +3.10%
Branches: 548325 -> 548301 (-0.00%)
PreSGPRs: 911317 -> 909700 (-0.18%)
PreVGPRs: 766279 -> 767070 (+0.10%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18253>
2022-09-22 12:35:11 +00:00
Daniel Schürmann
cf5f9854bc aco/optimizer: optimize s_and(exec, s_and(x, y)) more aggressively
It is enough if either of the two operands is respecting the
current exec mask.

Totals from 1680 (1.25% of 134913) affected shaders: (GFX10.3)
CodeSize: 3929436 -> 3922372 (-0.18%); split: -0.18%, +0.00%
Instrs: 730305 -> 728536 (-0.24%); split: -0.24%, +0.00%
Latency: 6839314 -> 6835154 (-0.06%); split: -0.07%, +0.01%
InvThroughput: 1371351 -> 1371267 (-0.01%); split: -0.01%, +0.00%
SClause: 32819 -> 32802 (-0.05%); split: -0.09%, +0.04%
Copies: 33264 -> 33271 (+0.02%); split: -0.01%, +0.03%

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18253>
2022-09-22 12:35:11 +00:00
Daniel Schürmann
79a8e8b5b2 aco/optimizer: do can_eliminate_and_exec() optimization later
This will allow to optimize s_not(v_cmp()) safely.

Totals from 1024 (0.76% of 134913) affected shaders: (GFX10.3)
CodeSize: 5695424 -> 5701860 (+0.11%); split: -0.00%, +0.11%
Scratch: 242688 -> 241664 (-0.42%)
Instrs: 1040656 -> 1041635 (+0.09%); split: -0.00%, +0.09%
Latency: 16842282 -> 16922790 (+0.48%); split: -0.06%, +0.54%
InvThroughput: 4772728 -> 4810868 (+0.80%); split: -0.10%, +0.90%
VClause: 20013 -> 20000 (-0.06%); split: -0.12%, +0.05%
Copies: 115057 -> 114384 (-0.58%); split: -1.22%, +0.63%
Branches: 34531 -> 34532 (+0.00%); split: -0.00%, +0.01%
PreSGPRs: 46263 -> 46267 (+0.01%)

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18253>
2022-09-22 12:35:11 +00:00
Yonggang Luo
c74595ead3 radv/r600/clover: Getting libelf to be optional
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18503>
2022-09-22 05:07:35 +00:00
Georg Lehmann
84c0529258 aco: Unswizzle v_pk_fma_f16 literals to produce more v_pk_fmac_f16.
No Foz-DB difference, but it reduces code size in some angle shaders.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18676>
2022-09-21 23:04:06 +00:00
Samuel Pitoiset
578e30f3e6 radv: make sure to initialize wd_switch_on_eop before checking its value
This is technically not a bug because it might just trigger
SWITCH_ON_EOI when streamout is used and I think it was fine.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7303
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18700>
2022-09-21 18:20:58 +00:00
Timur Kristóf
ed76471d08 aco/optimizer_postRA: Clarify terminology.
Change the terminology around the post-RA optimizer, primarily this
changes the use of "clobbered" to "overwritten" to avoid confusion,
and it removes some redundant states.

Proposed for backporting to stable, to make sure it is easy to
backport further fixes (if any) on top of this.

Fossil DB stats unaffected on Navi 21.

Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18488>
2022-09-21 16:56:57 +00:00
Timur Kristóf
a8dd07518c aco/optimizer_postRA: Fix logical control flow handling.
Change reset_block() so it only considers the logical
predecessors for VGPRs. Relevant for some optimizations
across loops.

This commit fixes an assertion failure which was triggered
by Zink in a piglit test.

Fossil DB stats unaffected on Navi 21.

Fixes: 2e56e23420
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18488>
2022-09-21 16:56:57 +00:00
Timur Kristóf
2eab413cf7 aco/optimizer_postRA: Don't assume all operand registers were written by same instr.
This assumption is no longer true since the post-RA optimizer
can work across blocks. It is now possible that some control
flow paths overwrite some but not all registers of an operand.

This commit may prevent invalid optimizations and/or assertion
failures (on debug builds).

Fossil DB stats unaffected on Navi 21.

Fixes: 0e4747d3fb
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18488>
2022-09-21 16:56:57 +00:00
Timur Kristóf
63063dd5ce aco/optimizer_postRA: Mark a register overwritten when predecessors disagree.
Affects blocks whose some (but not all) predecessors overwrite a register.
This commit fixes glitches in some games which regressed because of the
improved SCC no-compare optimization.

Fossil DB stats on Navi 21:

Totals from 2816 (2.09% of 134906) affected shaders:
CodeSize: 24224276 -> 24241580 (+0.07%)
Instrs: 4570595 -> 4574921 (+0.09%)
Latency: 53680256 -> 53693655 (+0.02%); split: -0.00%, +0.02%
InvThroughput: 9829289 -> 9830573 (+0.01%)

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7257
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7305
Fixes: 2e56e23420
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18488>
2022-09-21 16:56:57 +00:00
Timur Kristóf
5e80edfa78 aco/tests: Add post-RA SCC no-compare tests cases with control flow.
- scc_nocmp_across_cf: passes
- scc_nocmp_across_cf_partially_overwritten: fails (fixed later)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18488>
2022-09-21 16:56:56 +00:00
Timur Kristóf
d4b3f81d94 aco/tests: Add post-RA DPP test cases with control flow.
These are intended to make sure that the post-RA optimizer works
correctly across control flow. The new tests emit a divergent
if-else branch (with full logical+linear CFG).

- dpp_across_cf:
  Simple case of DPP optimizable across control flow. Should pass.
- dpp_across_cf_overwritten:
  Similar case but the DPP source register is overwritten in CF.
  This shows a bug so the test fails now (will be fixed).

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18488>
2022-09-21 16:56:56 +00:00
Timur Kristóf
d7cd49d54b aco/tests: Add post-RA optimizer testcase for partially overwritten VCC.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18488>
2022-09-21 16:56:56 +00:00
Timur Kristóf
2274b26dfb ac/nir/ngg: Don't initialize same-invocation mesh shader outputs.
This is actually not necessary and generates a lot of superfluous
instructions at every phi (setting the value to zero).

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18566>
2022-09-21 16:14:59 +00:00