Commit graph

6114 commits

Author SHA1 Message Date
Rhys Perry
b2ce7da149 radv: do nir_lower_bit_size after algebraic optimizations
There are too many algebraic optimizations to be certain that one of them
couldn't create instructions which need lowering. It also creates better
code for some reason.

fossil-db (parallel-rdp, Navi):
Totals from 217 (31.77% of 683) affected shaders:
VGPRs: 7716 -> 7672 (-0.57%)
CodeSize: 1516152 -> 1510688 (-0.36%); split: -0.38%, +0.02%
MaxWaves: 3964 -> 3982 (+0.45%)
Instrs: 269445 -> 268508 (-0.35%); split: -0.36%, +0.02%
Cycles: 37963416 -> 37912592 (-0.13%); split: -0.15%, +0.01%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4791>
2020-11-04 11:50:37 +00:00
Rhys Perry
c77114967f radv: move a few passes to after load/store vectorization
load/store vectorization can create 8/16-bit alu to do packing/unpacking,
which would make shader_info::bit_sizes_used out of date.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4791>
2020-11-04 11:50:37 +00:00
Jason Ekstrand
9d377c01d0 nir: Make nir_deref_instr::mode a bitfield
We rename it to "modes" to make it clear that it may contain more than
one mode and adjust all the uses of nir_deref_instr::modes to attempt to
handle multiple modes.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
2020-11-03 22:18:28 +00:00
Jason Ekstrand
3cc58e6470 nir: Add and use some deref mode helpers
NIR derefs currently have exactly one variable mode.  This is about to
change so we can handle OpenCL generic pointers.  In order to transition
safely, we need to audit every deref->mode check.  This commit adds a
set of helpers that provide more nuanced mode checks and converts most
of NIR to use them.

For simple cases, we add nir_deref_mode_is and nir_deref_mode_is_one_of
helpers.  These can be used in passes which don't have to bother with
generic pointers and just want to know what mode a thing is.  If the
pass ever encounters generic pointers in a way that this check would be
unsafe, it will assert-fail to alert developers that they need to think
harder about things and fix the pass.

For more complex passes which require a more nuanced understanding of
modes, we add nir_deref_mode_may_be and nir_deref_mode_must_be helpers
which accurately describe the compiler's best knowledge about the given
deref.  Unfortunately, we may not be able to exactly identify the mode
in a generic pointers scenario so we have to be very careful when we use
these.  Conversion of these passes is left to later commits.

For the case of mass lowering of a particular mode (nir_lower_explicit_io
is one good example), we add nir_deref_mode_is_in_set.  This is also
pretty assert-happy like nir_deref_mode_is but is for a set containment
comparison on deref modes where you expect the deref to either be all-in
or all-out.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
2020-11-03 22:18:28 +00:00
James Park
bfa9fd88fc radv,radv/winsys: Move RADV_MAX_IBS_PER_SUBMIT
RADV_MAX_IBS_PER_SUBMIT needs to be defined even for the null driver.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7394>
2020-11-03 15:50:38 +00:00
Samuel Pitoiset
57c152af9c aco: select v_mul_{hi}_u32_u24 for 24-bit multiplications
This is based on the NIR range analysis. v_mul_u32_u24 is VOP2, while
v_mul_lo_u32 is VOP3, so that should reduce codesize.

fossils-db (Vega10):
Totals from 12590 (9.22% of 136546) affected shaders:
SGPRs: 680207 -> 677271 (-0.43%); split: -0.47%, +0.04%
VGPRs: 620840 -> 620856 (+0.00%); split: -0.02%, +0.02%
CodeSize: 37930200 -> 37774088 (-0.41%); split: -0.41%, +0.00%
Instrs: 7463550 -> 7458120 (-0.07%); split: -0.07%, +0.00%
Cycles: 133487628 -> 133427532 (-0.05%); split: -0.05%, +0.00%
VMEM: 2514729 -> 2513426 (-0.05%); split: +0.02%, -0.08%
SMEM: 1533579 -> 1532795 (-0.05%); split: +0.05%, -0.10%
VClause: 231391 -> 231389 (-0.00%); split: -0.01%, +0.00%
SClause: 255352 -> 255294 (-0.02%); split: -0.04%, +0.02%
Copies: 605821 -> 600352 (-0.90%); split: -0.92%, +0.02%
Branches: 133739 -> 133743 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 351092 -> 348048 (-0.87%)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7405>
2020-11-03 13:47:40 +00:00
Samuel Pitoiset
3a72021d7c aco: store NIR range analysis data to the isel context
It will be used to optimize some ALU instructions.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7405>
2020-11-03 13:47:40 +00:00
Rhys Perry
ac65d3b6b8 radv: fix shader caching with NaN fixup workaround
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 6f21995f98 ("radv: add new drirc option radv_enable_mrt_output_nan_fixup")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7423>
2020-11-03 11:27:31 +00:00
Rhys Perry
36f62494ec radv: fix shader caching with discard->demote workaround
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: bdd7587414 ("radv: use nir_lower_discard_to_demote to work around game bugs")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7423>
2020-11-03 11:27:31 +00:00
Rhys Perry
19f3911cf8 radv: add some missing radv_{start,stop}_feedback
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7339>
2020-11-03 11:10:01 +00:00
James Park
4bd18e772a amd/llvm,aco: Replace VLA with alloca
MSVC will never support VLA, so use alloca instead.

Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7157>
2020-11-03 07:44:02 +00:00
Samuel Pitoiset
03f260cb27 radv,aco: optimize computing the sample mask for per-sample shading
I don't know why these values were introduced for but it seems like
we can optimize this by just doing:

gl_SampleMaskIn[0] = (SampleCoverage & (1 << gl_SampleID))

AMDGPU-PRO and AMDVLK apply the same formula to compute the
sample mask when per-sample shading is enabled.

No fossils-db changes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7377>
2020-11-02 08:05:47 +01:00
Samuel Pitoiset
c63bcda22c radv,aco: adjust the sample mask only if per-sample shading is enabled
When per-sample shading isn't enabled, we can just load the
samplemask from the hardware which is always the coverage of
the entire pixel/fragment.

fossilds-db (VEGA10):
Totals from 131 (0.10% of 136546) affected shaders:
SGPRs: 5056 -> 5048 (-0.16%)
VGPRs: 2600 -> 2372 (-8.77%)
CodeSize: 115788 -> 112560 (-2.79%)
MaxWaves: 1266 -> 1274 (+0.63%)
Instrs: 20620 -> 20071 (-2.66%)
Cycles: 82416 -> 80220 (-2.66%)
VMEM: 51567 -> 35532 (-31.10%); split: +0.24%, -31.34%
SMEM: 8952 -> 8258 (-7.75%); split: +0.11%, -7.86%
SClause: 1223 -> 1199 (-1.96%); split: -2.62%, +0.65%
Copies: 1247 -> 1124 (-9.86%); split: -10.18%, +0.32%
PreVGPRs: 2112 -> 1981 (-6.20%)

Helps Britannia, Shadow of the Tomb Raider, Warhammer II and Control.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7377>
2020-11-02 08:05:43 +01:00
Bas Nieuwenhuizen
8943c80c9b radv: Fix variable name collision.
idx was aliased, and eb104e949e started
using the outer var in the inner scope ...

Fixes: eb104e949e
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3701
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7388>
2020-10-30 23:44:48 +01:00
Mauro Rossi
e54c7f4b1a android: aco: add aco_form_hard_clauses.cpp to Makefile.sources
Fixes the following building error:

external/mesa/src/amd/compiler/aco_interface.cpp:160:
error: undefined reference to 'aco::form_hard_clauses(aco::Program*)'

Fixes: 3dfbed2a8 ("aco: create s_clause on GFX10+")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7380>
2020-10-30 13:34:06 +00:00
James Park
6d058ac6c9 aco: Fix accidental copies, attempt two
Use auto to avoid mistyping the constness of the pair key, which
triggers implicit conversions rather than compilation errors.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7346>
2020-10-30 09:55:53 +00:00
Rhys Perry
1761379481 aco: handle SDWA in the optimizer
Apply SGPRs/modifiers when possible and try not to break when SDWA
instructions are encountered.

No shader-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7349>
2020-10-29 18:08:31 +00:00
Rhys Perry
ecc5b59a70 aco: don't allow destination opsel for v_cvt_pknorm
It doesn't make sense to do this.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7349>
2020-10-29 18:08:31 +00:00
Rhys Perry
bb890f2e7c aco: fix combine_inverse_comparison()
fossil-db (Navi):
Totals from 16 (0.01% of 137413) affected shaders:
CodeSize: 6788 -> 6724 (-0.94%)
Instrs: 1250 -> 1234 (-1.28%)
Cycles: 4984 -> 4920 (-1.28%)

fossil-db (Polaris):
Totals from 16 (0.01% of 138881) affected shaders:
CodeSize: 7024 -> 6960 (-0.91%)
Instrs: 1337 -> 1321 (-1.20%)
Cycles: 5332 -> 5268 (-1.20%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7349>
2020-10-29 18:08:31 +00:00
Rhys Perry
7e4aa8c8e9 aco: fix printing of some sdwa sels
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7349>
2020-10-29 18:08:31 +00:00
Rhys Perry
70320f4117 aco: assert a label only uses one of the members in ssa_info's union
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7349>
2020-10-29 18:08:31 +00:00
Rhys Perry
3dfbed2a87 aco: create s_clause on GFX10+
This seems to give no measurable benefit to Strange Brigade or Shadow of
Mordor, but it's simple to do, helps in theory and all other compilers do
it.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5919>
2020-10-29 15:08:05 +00:00
Daniel Schürmann
f4c090a3b3 aco: refactor split_store_data() to always split into evenly sized elements
This fixes a couple of issues on GFX67 and
has no negative impact on newer hardware

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7105>
2020-10-29 14:32:59 +00:00
Timur Kristóf
f74ef15879 aco/ngg: Incorporate GS invocations into workgroup size calculation.
If the workgroup_size variable is lower than the actual workgroup size,
that means it's possible that ACO won't emit some s_barrier instructions
when in fact it should. This can possibly cause a GPU hang.

This is just for the sake of general correctness, currently this
can't cause a real problem because the maximum vertex count is always
greater than (or equal to) the primitive count in GS, and already
takes into account the number of GS invocations.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7232>
2020-10-28 21:55:54 +01:00
Timur Kristóf
09b9e52c0d aco/ngg: Export a zero-area triangle when primitive count is 0.
This is a workaround for a bug in Navi 1x NGG HW.

Very rarely, the Navi 1x PA can hang when an NGG workgroup exports
0 total primitives. According to AMD, we always need this workaround
when it is possible that the number of primitives is 0.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7232>
2020-10-28 21:55:47 +01:00
Timur Kristóf
73449f9a62 aco: Add a few assertions about LDS usage.
This is to make sure we don't compile a shader which doesn't
fit the available LDS space.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7232>
2020-10-28 21:47:22 +01:00
Timur Kristóf
b6654adc0e aco: Make emitting reduction instructions a bit more convenient.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7232>
2020-10-28 21:47:22 +01:00
Timur Kristóf
8d6246205a aco: Add some validation for PSEUDO_REDUCTION instructions.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7232>
2020-10-28 21:47:22 +01:00
Timur Kristóf
260f9c503a aco/ngg: Put shader query reduction operand into a VGPR.
The p_reduce instruction only works if this operand is in a VGPR,
and otherwise gets lowered to incorrect code.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7232>
2020-10-28 21:47:22 +01:00
Timur Kristóf
9757c3cb6b aco: Assert that workgroup barriers are not used inappropriately.
Example:
It is possible for some NGG GS waves to have 0 ES and/or GS invocations,
and in that case having an s_barrier inside divergent control flow can
very possibly hang the GPU.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7232>
2020-10-28 21:47:19 +01:00
Rhys Perry
ecdcf22d5d aco: switch aco_print_asm to a FILE *
Streams are really stateful and (IMO) difficult to read for non-trivial
usage. This is also more consistent with NIR and the rest of ACO.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7166>
2020-10-28 17:32:32 +00:00
Rhys Perry
a293fad4ef aco: refactor repeated instruction disassembly
This seems simpler to me. It should also work correctly when repeated
instructions cross blocks.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7166>
2020-10-28 17:32:32 +00:00
Rhys Perry
ed2449d55b aco: move individual instruction disassembly to its own helper
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7166>
2020-10-28 17:32:32 +00:00
Rhys Perry
483657de32 aco: use mubuf helper in select_gs_copy_shader
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6103>
2020-10-28 14:59:49 +00:00
Rhys Perry
ec7ecfe9cb aco: use control flow creation helpers in select_gs_copy_shader
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6103>
2020-10-28 14:59:49 +00:00
Rhys Perry
57d977a23f aco: round bytes_written to dwords if larger than 4 bytes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7276>
2020-10-28 10:56:27 +00:00
Rhys Perry
41839d38cf aco: default to a definition size of 32
For non-arithmetic opcodes such as buffer_load_dword and buffer_load_short,
default to a definition size of 32.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7276>
2020-10-28 10:56:27 +00:00
Daniel Schürmann
fef8a4befd radv: remove call to nir_lower_pack()
The pack_* instructions are now lowered via nir_lower_alu_to_scalar()
and unpack_* are not lowered anymore.

These bitcasts are no-ops, and lowering prevents
some optimizations like vectorization.

Note: There are still some *_split variations remaining
from different other NIR passes.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6527>
2020-10-28 10:14:26 +00:00
Daniel Schürmann
212be2a04e radv: lower pack_[64/32]_* via nir_lower_alu_to_scalar()
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6527>
2020-10-28 10:14:26 +00:00
Daniel Schürmann
121fa017e1 ac/nir: implement nir_op_[un]pack_64_4x16
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6527>
2020-10-28 10:14:26 +00:00
Daniel Schürmann
543f50789a aco: implement nir_op_unpack_[64/32]_*
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6527>
2020-10-28 10:14:26 +00:00
Bas Nieuwenhuizen
eb104e949e radv: Do not access set layout during vkCmdBindDescriptorSets.
The spec says:

"
VkDescriptorSetLayout objects may be accessed by commands that operate on descriptor sets allocated using that layout
"

So our behavior is valid here, but this is a temporary workaround for an issue with Baldur's Gate 3.

CC: mesa-stable
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3607
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7207>
2020-10-28 03:06:20 +00:00
Bas Nieuwenhuizen
29999e6b9d radv: Fix 1D compressed mipmaps on GFX9.
Partial rollback as GFX9 really requires height = 1 to work.

The two substantial parts of the fix remaining:

1) Deal with views with multiple levels.
2) Limit the expansion to the base mip pitch/height. On GFX9 this
   is exactly equal to the surf_pitch that was used before. I've
   done some investigation to make sure that on GFX10 this always
   results in the right physical layout.

Remaining stupid question is how the actual extents for bounds
checking never end up too low when the size gets clamped, but
this change and the previous change don't change that ...

Fixes: 1fb3e1fb70 "radv: Fix mipmap extent adjustment on GFX9+."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7245>
2020-10-28 00:31:04 +00:00
Vinson Lee
fdb1997ab5 Fix VMware capitalization.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7260>
2020-10-27 15:33:40 -07:00
Rhys Perry
26e53e3afa aco: ignore the ACO-inserted continue in create_continue_phis()
Otherwise, for loops without continue_or_break, create_continue_phis()
always returns an undef operand.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 638cbc21a1 ("aco: handle when ACO adds new continue edges")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2848
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7148>
2020-10-27 19:53:38 +00:00
James Park
328a350387 vulkan/util,vulkan/wsi,radv: Add typed outarray API
MSVC cannot perform GCC __typeof__ for C code. (C++ has decltype.)

Add adjacent functions to allow specifying types manually.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7270>
2020-10-27 08:47:52 -07:00
Rhys Perry
437995bb70 aco: remove all-undef phi opt
This doesn't look like it would create correct IR for 8/16-bit phis and
doesn't seem to help anything. If we ever want to do this, it's probably
better done in nir_opt_remove_phis().

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>
2020-10-27 15:24:38 +00:00
Rhys Perry
70ff262cda aco: use v_mov_b32_sdwa for some 16-bit constants
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>
2020-10-27 15:24:38 +00:00
Rhys Perry
b882598ee1 aco: remove some unused optimizations
These are unused now that we almost always use p_parallelcopy for simple
copies.

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>
2020-10-27 15:24:38 +00:00
Rhys Perry
d20a752c0d aco: use Builder::copy more
fossil-db (Navi):
Totals from 6973 (5.07% of 137413) affected shaders:
SGPRs: 381768 -> 381776 (+0.00%)
VGPRs: 306092 -> 306096 (+0.00%); split: -0.00%, +0.00%
CodeSize: 24440844 -> 24421196 (-0.08%); split: -0.09%, +0.01%
MaxWaves: 86581 -> 86583 (+0.00%)
Instrs: 4682161 -> 4679578 (-0.06%); split: -0.06%, +0.00%
Cycles: 68793116 -> 68261648 (-0.77%); split: -0.83%, +0.05%

fossil-db (Polaris):
Totals from 8154 (5.87% of 138881) affected shaders:
VGPRs: 338916 -> 338920 (+0.00%); split: -0.00%, +0.00%
CodeSize: 23540428 -> 23540488 (+0.00%); split: -0.00%, +0.00%
MaxWaves: 49090 -> 49091 (+0.00%)
Instrs: 4576085 -> 4576101 (+0.00%); split: -0.00%, +0.00%
Cycles: 51720704 -> 51720888 (+0.00%); split: -0.00%, +0.00%

Most of the Navi cycle/instruction changes are from 8/16-bit parallel-rdp
shaders. They appear to be improved because the p_create_vector from
lower_subdword_phis() was blocking constant propagation.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>
2020-10-27 15:24:38 +00:00