Commit graph

131258 commits

Author SHA1 Message Date
Rhys Perry
15d08a06e2 aco/tests: expand optimize.const_comparison_ordering tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7045>
2020-11-13 12:34:27 +00:00
Rhys Perry
6bf3c606be aco/tests: initialize debug function
aco_log() will print the message to stderr.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7045>
2020-11-13 12:34:27 +00:00
Rhys Perry
966732e8ca aco: disallow various v_add_u32 opts if modifiers are used
Check for clamp, SDWA or DPP. The optimization isn't possible with SDWA
and DPP, so it would have been skipped anyway. Doing any of these with a
clamp modifier present would be incorrect.

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7045>
2020-11-13 12:34:27 +00:00
Rhys Perry
91ffeed88a aco: fix combine_constant_comparison_ordering() NaN check with 16/64-bit
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7045>
2020-11-13 12:34:27 +00:00
Rhys Perry
d4c821da0e aco: don't combine precise max(min()) to med3
fossil-db (Navi):
Totals from 241 (0.18% of 137413) affected shaders:
CodeSize: 856280 -> 856308 (+0.00%); split: -0.00%, +0.00%
Instrs: 164220 -> 164514 (+0.18%); split: -0.00%, +0.18%
Cycles: 1031916 -> 1033092 (+0.11%); split: -0.00%, +0.11%
VMEM: 77855 -> 78514 (+0.85%); split: +0.85%, -0.01%
SMEM: 20501 -> 20593 (+0.45%); split: +0.46%, -0.01%
Copies: 9791 -> 9790 (-0.01%); split: -0.03%, +0.02%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7045>
2020-11-13 12:34:27 +00:00
Pierre-Eric Pelloux-Prayer
6e7e208867 radeonsi: remove AMD_DEBUG=zerovram flag
The same feature is available by using: radeonsi_zerovram=true

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7525>
2020-11-13 11:19:58 +00:00
Pierre-Eric Pelloux-Prayer
b9605f1a74 radeonsi: remove unused NO_RB_PLUS flag
It's not used since https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1751.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7525>
2020-11-13 11:19:58 +00:00
Simon Ser
1cf1ece738 radv: add img debug flag
This is similar to AMD_DEBUG=tex, but for radv.

Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5734>
2020-11-13 11:32:17 +01:00
Simon Ser
dc93fd759a radeonsi: use ac_surface_print_info in si_print_texture_info
Pieces of information not printed by ac_surface_print_info are still
printed in si_print_texture_info.

Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5734>
2020-11-13 10:30:42 +01:00
Simon Ser
92470b3d74 amd/common: introduce ac_surface_print_info
This is mostly copied from si_print_texture_info, with the si-specific
bits removed. Moving it into common code allows it to be used from both
radeonsi and radv.

Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5734>
2020-11-13 10:30:37 +01:00
Erik Faye-Lund
ee657df09a meson: verify that d3d12.h exists when building the d3d12 driver
Without this header-file, we can't build the driver. So let's verify
that it exists, and can be used by the C++ compiler.

This should make it a bit more clear what's wrong if someone attempts to
build this using MinGW or on Linux.

Fixes: 2ea15cd661 ("d3d12: introduce d3d12 gallium driver")
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7575>
2020-11-13 08:40:52 +00:00
Erik Faye-Lund
314f18b22a microsoft/compiler: correct typo
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7578>
2020-11-13 08:35:34 +00:00
Erik Faye-Lund
4c82cdcb7e microsoft/compiler: inline some struct-declarations
We don't need to refer to these by name anywhere, so let's just inline
these for readability reasons.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7578>
2020-11-13 08:35:34 +00:00
Erik Faye-Lund
b9a99b22aa microsoft/compiler: move c++ higher up
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7578>
2020-11-13 08:35:34 +00:00
Erik Faye-Lund
a2a35b2d20 microsoft/compiler: remove unused struct
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7578>
2020-11-13 08:35:34 +00:00
Samuel Pitoiset
68488fd383 aco: optimize v_add(v_bcnt(a, 0), b) to v_bcnt(a, b)
The first operand of v_bcnt should always be a VGPR, because isel
selects s_bcnt1 when it is an SGPR. I added a sanity check anyway to
prevent any problems.
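
A minimal sketch of why the rewrite preserves the result, assuming
v_bcnt(a, b) computes popcount(a) + b with 32-bit wraparound. The Python
helpers below are illustrative models, not ACO code:

```python
MASK32 = 0xFFFFFFFF


def v_bcnt(a: int, b: int) -> int:
    """Model of v_bcnt: count the set bits of a, then add b (mod 2**32)."""
    return (bin(a & MASK32).count("1") + b) & MASK32


def v_add(a: int, b: int) -> int:
    """Model of a 32-bit wrapping add."""
    return (a + b) & MASK32


def unfused(a: int, b: int) -> int:
    # Pattern before the optimization: v_add(v_bcnt(a, 0), b)
    return v_add(v_bcnt(a, 0), b)


def fused(a: int, b: int) -> int:
    # Pattern after the optimization: v_bcnt(a, b)
    return v_bcnt(a, b)
```

Both forms wrap the same sum once, so they agree for every pair of
32-bit inputs, including ones where the addition overflows.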

fossils-db (Vega10):
Totals from 23 (0.02% of 139517) affected shaders:
CodeSize: 106828 -> 106664 (-0.15%)
Instrs: 20242 -> 20201 (-0.20%)
Cycles: 213112 -> 211352 (-0.83%)
VMEM: 3200 -> 3184 (-0.50%)
SMEM: 928 -> 927 (-0.11%)

Helps Control, Assassin's Creed Origins and Youngblood.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7568>
2020-11-13 07:28:50 +00:00
Martin Peres
580fbbb59a driconf: remove the redundant glx-extension-disabling options
Now that we introduced the generic glx_extension_override option,
we can remove the glx_disable_oml_sync_control,
glx_disable_sgi_video_sync, and glx_disable_ext_buffer_age ones.

It seems like the only user for them was the vmwgfx, and only for
Gnome and Compiz which are covered by the default mesa driconf. This
means that it is unlikely for a user to have these options set in
their local driconf file.

Suggested-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Martin Peres <martin.peres@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7252>
2020-11-13 08:48:34 +02:00
Bas Nieuwenhuizen
3fa3bc19a2 radeonsi: Add auxiliary plane support.
This adds support for multiple DRM planes for a single format plane
and uses that to enable DCC support with modifiers.

With the implicit flush patches we can also enable displayable DCC
both with and without DCC as the X server and compositors know not
to do frontbuffer rendering onto images with multiple DRM planes.

For now we require that the extra planes are essentially fixed though.
We require that the offset/stride are the same as ac_surface computes
and that all planes are in the same buffer. This is mainly for
simplicity and could be somewhat more relaxed in the future given
a strong usecase.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6176>
2020-11-13 03:27:28 +00:00
Bas Nieuwenhuizen
605140e401 radeonsi: Do not try to disable displayable DCC with modifiers.
We do flushing on glFlush etc., so we don't need explicit flush,
but we still need to avoid frontbuffer rendering.

For modifiers, logic was added to apps that basically prevents
frontbuffer rendering if multiple planes are involved.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6176>
2020-11-13 03:27:28 +00:00
Bas Nieuwenhuizen
23b59b6f87 radeonsi: Do not disable DCC when we have it as a modifier.
Because other processes might be expecting DCC.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6176>
2020-11-13 03:27:28 +00:00
Bas Nieuwenhuizen
c786150dfa radeonsi: Add modifier support.
This adds basic modifier support in radeonsi.

Support for import/export of DCC comes in a later patch as that
needs support for multiple memory planes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6176>
2020-11-13 03:27:28 +00:00
Bas Nieuwenhuizen
f7a4051b83 radeonsi: Check pitch and offset for validity.
And lack of overflows, which should help for security.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6176>
2020-11-13 03:27:28 +00:00
Bas Nieuwenhuizen
395dac7bf9 amd/common: Add modifier tests.
This primarily tests that:
 - multiple GPUs with the same GPU modifier parameters result
   in the same tiling layout.
 - The size & alignment calculations don't change for a given
   modifier & image parameters.

It does this primarily based on addrlib. Radeonsi has used addrlib
for the retiling of displayable DCC for a while already, so the
DCC tiling should be pretty reliable.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6176>
2020-11-13 03:27:28 +00:00
Bas Nieuwenhuizen
0833dd7d12 amd/common: Add support for modifiers.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6176>
2020-11-13 03:27:28 +00:00
Bas Nieuwenhuizen
2cc2b45688 drm-uapi: Add AMD modifiers.
This adds modifiers for GFX9+ AMD GPUs.

As the modifiers need a lot of parameters I split things out in
getters and setters.
  - Advantage: simplifies the code a lot
  - Disadvantage: Makes it harder to check that you're setting all
                  the required fields.
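
The getter/setter split can be sketched as below. This is only an
illustration: the shifts and widths are hypothetical and do not match
the real modifier layout, and mod_set/mod_get are made-up helper names.

```python
# Hypothetical field layout: name -> (shift, width in bits).
FIELDS = {
    "TILE_VERSION": (0, 8),
    "TILE": (8, 5),
    "DCC": (13, 1),
    "DCC_RETILE": (14, 1),
}


def mod_set(field: str, value: int) -> int:
    """Return value shifted into the position of the given field."""
    shift, width = FIELDS[field]
    if value < 0 or value >> width:
        raise ValueError(f"{value} does not fit in {width}-bit field {field}")
    return value << shift


def mod_get(field: str, modifier: int) -> int:
    """Extract the given field from a packed modifier."""
    shift, width = FIELDS[field]
    return (modifier >> shift) & ((1 << width) - 1)


# Building a modifier is a chain of ORed setters.
modifier = mod_set("TILE_VERSION", 2) | mod_set("TILE", 9) | mod_set("DCC", 1)
```

The sketch also shows the disadvantage: DCC_RETILE was never set above,
and nothing complains. It silently reads back as 0.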

The tiling modes seem to change every generation, but the structure
of what each tiling mode is good for stays really similar. As such
the core of the modifier is
 - the tiling mode
 - a version. Not explicitly a GPU generation, but splitting out
   a new set of tiling equations.

Sometimes one or two tiling modes stay the same and for those we
specify a canonical version.

Then we have a bunch of parameters on how the compression works.
Different HW units have different requirements for these and we
actually have some conflicts here.

e.g. the render backends need a specific alignment but the display
unit only works with unaligned compression surfaces. To work around
that we have a DCC_RETILE option where both an aligned and unaligned
compression surface are allocated and a writer has to sync the
aligned surface to the unaligned surface on handoff.

Finally there are some GPU parameters that participate in the tiling
equations. These are constant for each GPU on the rendering/texturing
side. The display unit is very flexible however and supports all
of them :|

Some estimates:
 - Single GPU, render+texture: ~10 modifiers
 - All possible configs in a gen, display: ~1000 modifiers
 - Configs of actually existing GPUs in a gen: ~100 modifiers

For formats with a single plane everything gets put in a separate
DRM plane. However, this doesn't fit for some YUV formats, so if
the format has >1 plane, we let the driver pack the surfaces into
1 DRM plane per format plane.

This way we avoid X11 rendering onto the frontbuffer with DCC, but
still fit into 4 DRM planes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6176>
2020-11-13 03:27:28 +00:00
Bas Nieuwenhuizen
d4f7962d48 radeonsi: Add displayable DCC flushing without explicit flushes.
Flushes non-explicit shared textures that need retiling on

* glFlush
* glSync
* glSignalSemaphoreEXT
* DRI fences.
* The first time we create a non-explicit handle for it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6176>
2020-11-13 03:27:28 +00:00
Bas Nieuwenhuizen
3e2dcb3c07 amd/addrlib: Use signed char for INT_8.
Some architectures like aarch64 and ppc64el have char = unsigned char.
This breaks meta equation generation for DCC coords, as addrlib tries
to filter all the Z bits > -1 which ends up being all the Z bits > 255.
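
A small model of the failure, using Python's ctypes to stand in for the
two possible meanings of a plain C char. The bit list and variable names
are illustrative and not taken from addrlib:

```python
import ctypes


def as_signed_char(value: int) -> int:
    """Value of the sentinel when char is signed (e.g. x86)."""
    return ctypes.c_byte(value).value


def as_unsigned_char(value: int) -> int:
    """Value of the sentinel when char is unsigned (e.g. aarch64)."""
    return ctypes.c_ubyte(value).value


z_bits = [0, 1, 2, 3]

# Intended behavior: every bit index is greater than the -1 sentinel.
kept_signed = [bit for bit in z_bits if bit > as_signed_char(-1)]

# With an unsigned char the sentinel reads back as 255, so nothing passes.
kept_unsigned = [bit for bit in z_bits if bit > as_unsigned_char(-1)]
```

Declaring the type as signed char makes the sentinel compare as -1 on
every architecture.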

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7593>
2020-11-13 02:37:54 +00:00
Bas Nieuwenhuizen
9acfbe3022 radv: Do the sample check for tiling earlier.
The LINEAR optimization is not allowed for MSAA images.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7594>
2020-11-13 02:56:07 +01:00
Vinson Lee
dad6b62576 turnip: Fix file descriptor return.
Fix defect reported by Coverity Scan.

Logically dead code (DEADCODE)
dead_error_line: Execution cannot reach the expression -1 inside this statement: return ret ? -1 : handle.fd;

Fixes: cec0bc73e5 ("turnip: rework fences to use syncobjs")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7498>
2020-11-12 22:32:23 +00:00
Marek Olšák
fe3b5241a4 radeonsi: enable GL_EXT_demote_to_helper_invocation
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7586>
2020-11-12 21:02:05 +00:00
Marek Olšák
aa757f4f8c ac/llvm: fix demote inside conditional branches
The big comment explains it.

v2: don't kill if subgroup ops are used

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7586>
2020-11-12 21:02:05 +00:00
Marek Olšák
cb20d58f45 nir: optimize nir_lower_discard_to_demote to lower discard/demote both ways
This is smarter and also lowers demote to discard if helper invocations are
not needed.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7586>
2020-11-12 21:02:05 +00:00
Marek Olšák
d5039f99b4 nir: gather shader_info::needs_all_helper_invocations
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7586>
2020-11-12 21:02:05 +00:00
Marek Olšák
baa5807e36 nir: rename needs_helper_invocations to needs_quad_helper_invocations
This indicates that only quad operations use helper invocations.
Also handle quad_swizzle_amd.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7586>
2020-11-12 21:02:05 +00:00
Marek Olšák
96c12b7dc2 nir: optionally shuffle local invocation IDs for compute quad derivatives
Used by radeonsi. local_invocation_index is lowered only when quad
derivatives are enabled.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7586>
2020-11-12 21:02:05 +00:00
Boyuan Zhang
99e17b0c4a radeon: fix license in header
Incorrect license was added previously.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7561>
2020-11-12 20:54:54 +00:00
Eric Anholt
0b4825c872 gallium/draw: Fix rasterizer_discard for wide points/lines.
Fixes the rasterizer_discard failures for softpipe, because the wide paths
(which we hit for points in the CTS) were dropping the discard state when
making the no_cull shadow state.

Cc: mesa-stable
Reviewed-by: Brian Paul <brianp@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7558>
2020-11-12 20:27:15 +00:00
Brendan Dougherty
9edb6e1be0 mesa: Fix vertex_format_to_pipe_format index.
Corrects the index into the vertex_formats table for `integer` and
`normalized` values other than 0 or 1.
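
The class of bug can be sketched as follows. The table and the index
formula are hypothetical stand-ins, not the real vertex_formats layout.
The point is only that flag-like inputs must be collapsed to 0 or 1
before they take part in index arithmetic:

```python
# Hypothetical 4-entry table indexed by integer * 2 + normalized.
FORMATS = ["float", "normalized", "integer", "invalid"]


def index_raw(integer: int, normalized: int) -> int:
    """Uses the flags as given, so any non-zero value other than 1 is wrong."""
    return integer * 2 + normalized


def index_fixed(integer: int, normalized: int) -> int:
    """Collapses each flag to 0 or 1 first (the equivalent of !! in C)."""
    return int(bool(integer)) * 2 + int(bool(normalized))


def lookup(integer: int, normalized: int) -> str:
    return FORMATS[index_fixed(integer, normalized)]
```

For example, a caller passing normalized = 255 to mean "true" gets index
255 from index_raw but index 1 from index_fixed.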

Fixes: e6448f993b ("mesa: translate into gallium vertex formats in mesa/main")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7554>
2020-11-12 20:07:57 +00:00
Marcin Ślusarz
6e6dab4799 nir: handle float atomics in copy propagation pass
Without this patch, the copy propagation pass can optimize buffer
loads out of a compare & swap loop, which then leads to an infinite
loop.
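
A toy model of the failure mode, assuming the usual compare & swap retry
pattern. The Cell class, cas_add() and bump_once() are illustrative
names only. A bounded iteration count stands in for the infinite loop:

```python
class Cell:
    """A single memory location with a compare-and-swap operation."""

    def __init__(self, value: float):
        self.value = value

    def compare_swap(self, expected: float, new: float) -> float:
        old = self.value
        if old == expected:
            self.value = new
        return old


def cas_add(cell, delta, reload_each_iteration, interfere, max_iters=10):
    """Add delta via a CAS loop; return the iteration count, or None."""
    expected = cell.value  # the load that must not be hoisted
    for iteration in range(max_iters):
        if reload_each_iteration:
            expected = cell.value  # correct: re-read inside the loop
        interfere(cell, iteration)  # another invocation may write here
        if cell.compare_swap(expected, expected + delta) == expected:
            return iteration + 1
    return None  # never succeeded: models the infinite loop


def bump_once(cell, iteration):
    """Simulates a concurrent writer that changes the value once."""
    if iteration == 0:
        cell.value += 1.0
```

With reload_each_iteration=False the loop keeps comparing against the
stale first load, so a single concurrent write is enough to make every
later compare & swap fail.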

Triggered by a change to atomicCompSwap float test in piglit.

Fixes: 8424cd8fbd ("nir: Account for atomics in copy propagation.")
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7538>
2020-11-12 19:20:50 +00:00
Rob Clark
8de279f8db freedreno/drm: Add some locking asserts
Also fix evil-twin table_lock which they turned up.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7580>
2020-11-12 18:14:56 +00:00
Rhys Perry
9eb2ae5d21 radv/winsys: set has_dedicated_vram in the null winsys
NGG is disabled if this is false.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7577>
2020-11-12 17:35:25 +00:00
Rob Clark
f6359d2dc3 nir: Fix nir_validate fail after nir_lower_tex
It is UB to initialize unions on the stack and rely on bits not covered
by the initialized union member to be zero.  Let's just simplify it and
move the entire nir_const_value off the stack.

While we're in there, sprinkle around some const.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3778
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7579>
2020-11-12 17:12:17 +00:00
Yuxuan Shui
53660e4c4e Add EGL xcb platform
This enables GL applications to be written without any involvement of
Xlib.

The EGL X11 platform is actually already xcb-only underneath, so this
commit just adds the necessary interface changes so eglDisplay can be
created from an xcb_connection_t.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6474>
2020-11-12 16:39:47 +00:00
Alexander Kanavin
8bb1a75b4f anv: fix a build race between generating a header and using it
anv_batch_chain.c includes genX_bits.h but doesn't ensure it gets
generated first. This causes build failures, as observed here:
https://autobuilder.yoctoproject.org/typhoon/#/builders/86/builds/1501/steps/8/logs/step2d

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Alexander Kanavin <alex.kanavin@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7412>
2020-11-12 15:57:05 +01:00
Erik Faye-Lund
5d2e9d76c1 d3d12: fix code after simple-shader helper changes
Fixes: 4e9328e3b6 ("nir_builder: Return a new builder from nir_builder_init_simple_shader().")
Fixes: 5f992802f5 ("nir/builder: Drop the mem_ctx arg from nir_builder_init_simple_shader().")
Fixes: eda3e4e055 ("nir/builder: Add a name format arg to nir_builder_init_simple_shader().")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7574>
2020-11-12 14:32:56 +00:00
Michel Zou
5f99962540 zink: fix build on windows
guard the drm includes that are not available on this platform

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7432>
2020-11-12 15:16:13 +01:00
Samuel Pitoiset
db9d13b4ff aco: optimize v_add_u32(v_mul_lo_u16) -> v_mad_u32_u16
fossils-db (Vega10):
Totals from 779 (0.56% of 139517) affected shaders:
CodeSize: 1187928 -> 1187508 (-0.04%); split: -0.04%, +0.00%
Instrs: 247353 -> 244608 (-1.11%); split: -1.11%, +0.00%
Cycles: 1127472 -> 1116420 (-0.98%); split: -0.98%, +0.00%
VMEM: 139720 -> 138297 (-1.02%); split: +0.00%, -1.02%
SMEM: 51069 -> 50735 (-0.65%); split: +0.04%, -0.69%
Copies: 11548 -> 11547 (-0.01%); split: -0.03%, +0.03%

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7425>
2020-11-12 12:32:26 +00:00
Samuel Pitoiset
20e48551ac aco: select v_mul_lo_u16 for 16-bit multiplications that can't overflow
Only on GFX8-9 because GFX10 doesn't zero the upper 16 bits.

No fossils-db changes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7425>
2020-11-12 12:32:26 +00:00
Samuel Pitoiset
7028e9875f aco: select v_mad_u32_u16 for 16-bit multiplications on GFX9+
No fossils-db changes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7425>
2020-11-12 12:32:26 +00:00
Samuel Pitoiset
bbdafd6ab3 aco: optimize v_mad_u32_u16 with acc=0 to v_mul_u32_u24
v_mad_u32_u16 will be selected by isel to keep the range analysis
information around and to combine more v_add_u32+v_mad_u32_u16
together. When it's not possible to optimize that pattern, fall back
to v_mul_u32_u24 which is VOP2 instead of VOP3.
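
A rough model of why the fallback is safe, assuming v_mad_u32_u16
multiplies the low 16 bits of its operands and adds a 32-bit
accumulator, and v_mul_u32_u24 multiplies the low 24 bits. The helpers
are illustrative, not ACO code:

```python
MASK32 = 0xFFFFFFFF


def v_mad_u32_u16(a: int, b: int, acc: int) -> int:
    """Model: 16-bit x 16-bit multiply plus a 32-bit accumulator."""
    return ((a & 0xFFFF) * (b & 0xFFFF) + acc) & MASK32


def v_mul_u32_u24(a: int, b: int) -> int:
    """Model: 24-bit x 24-bit multiply, low 32 bits of the product."""
    return ((a & 0xFFFFFF) * (b & 0xFFFFFF)) & MASK32


def operands_fit_u16(a: int, b: int) -> bool:
    """The condition under which both models agree when acc is 0."""
    return a <= 0xFFFF and b <= 0xFFFF
```

When both operands are known to fit in 16 bits they also fit in 24 bits,
so the two instructions give the same product. If upper bits may be set,
the results can differ, which is why the operand range has to be known.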

No fossils-db changes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7425>
2020-11-12 12:32:26 +00:00