Commit graph

15571 commits

Author SHA1 Message Date
Georg Lehmann
dedfff9dbf aco: only set latekill in live_var_analysis
Cleaner to have this all in one place, in my opinion.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30368>
2024-08-12 10:31:09 +00:00
Georg Lehmann
510f5e55be aco/gfx10+: set lateKill for sgprs used by wave64 VALU writing a mask
RDNA2 ISA doc, 6.2.4. Wave64 Destination Restrictions:
The first pass of a wave64 VALU instruction may not overwrite a scalar value
used by the second half.

Foz-DB Navi31:
Totals from 5221 (6.58% of 79395) affected shaders:
Instrs: 9751484 -> 9752179 (+0.01%); split: -0.01%, +0.01%
CodeSize: 50624072 -> 50626088 (+0.00%); split: -0.00%, +0.01%
Latency: 85646450 -> 85647419 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 15039160 -> 15039277 (+0.00%); split: -0.00%, +0.00%
VClause: 200275 -> 200204 (-0.04%)
SClause: 248645 -> 248607 (-0.02%); split: -0.03%, +0.01%
Copies: 640802 -> 641413 (+0.10%); split: -0.01%, +0.11%
PreSGPRs: 236297 -> 236735 (+0.19%)
VALU: 5666449 -> 5666440 (-0.00%)
SALU: 967482 -> 968111 (+0.07%); split: -0.01%, +0.07%

Cc: mesa-stable

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30368>
2024-08-12 10:31:09 +00:00
Collabora's Gfx CI Team
d330870f9c Uprev Piglit to f11abb664bfcad09586f32f411b90331e23be2e5
0453436872...f11abb664b

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30597>
2024-08-10 07:39:41 +00:00
David Heidelberg
bda1a0596e meson/addrlib: allow unintialized callbacks
Resolves:
[328/4125] Compiling C++ object src/amd/addrlib/libaddrlib.a.p/src_core_addrlib1.cpp.o
In static member function 'static VOID Addr::Object::ClientFree(VOID*, const Addr::Client*)',
    inlined from 'static VOID Addr::Object::operator delete(VOID*)' at ../src/amd/addrlib/src/core/addrobject.cpp:190:15,
    inlined from 'virtual Addr::Object::~Object()' at ../src/amd/addrlib/src/core/addrobject.cpp:71:1:
../src/amd/addrlib/src/core/addrobject.cpp:129:28: error: '*(const Addr::Client*)((char*)this + 8).Addr::Client::callbacks._ADDR_CALLBACKS::freeSysMem' is used uninitialized [-Werror=uninitialized]
  129 |     if (pClient->callbacks.freeSysMem != NULL)
      |         ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11634

Suggested-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30482>
2024-08-10 15:13:33 +09:00
Marek Olšák
07554d32db ac/nir: adjust gfx11 tuning for the compute blit
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30208>
2024-08-10 02:14:44 +00:00
Marek Olšák
db7823e8b9 ac/nir: adjust performance-related decisions for clear/copy_buffer shader
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30208>
2024-08-10 02:14:44 +00:00
Marek Olšák
361266fec7 ac/nir: import the clear/copy_buffer compute shader from radeonsi
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30208>
2024-08-10 02:14:44 +00:00
Friedrich Vock
70fc5987d4 radv/rt: Don't atomicAdd local prefix sums
This only gets written to by one invocation at a time.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30483>
2024-08-09 18:12:52 +00:00
Friedrich Vock
a3df3ebab4 radv/rt: Only do ploc atomicCompSwap once per workgroup
There is no need to do this for every invocation in the wave.

Improves GravityMark scores by 5%.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30483>
2024-08-09 18:12:52 +00:00
Georg Lehmann
df16f47036 aco: optimize dd[xy]_fine if it's only used by abs
If we can ignore the sign of the derivative, we can swap the lanes
instead of broadcasting per direction. abs(a - b) = abs(b - a).
Shamelessly copied from bifrost.

Foz-DB Navi31:
Totals from 5 (0.01% of 79206) affected shaders:
Instrs: 6191 -> 6184 (-0.11%)
CodeSize: 31960 -> 31920 (-0.13%)
Latency: 111961 -> 111926 (-0.03%)
InvThroughput: 10390 -> 10372 (-0.17%)
VALU: 3286 -> 3279 (-0.21%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30557>
2024-08-08 17:39:55 +00:00
Timur Kristóf
f317311bad ac/nir: Shorten the name of ac_nir_calc_io_offset_mapped.
The other variant of this function doesn't
exist anymore, so there is no ambiguity.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
2024-08-08 16:55:02 +00:00
Timur Kristóf
c9b5ef0e53 ac/nir/tess: Simplify calculation of HS output LDS offset.
No functional change, just make the code more readable.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
2024-08-08 16:55:02 +00:00
Timur Kristóf
f917b81665 radv: Stop assigning linked driver locations.
The ac/nir passes are now able to map memory locations for the I/O
intrinsics without relying on driver location (intrinsic base).

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
2024-08-08 16:55:02 +00:00
Timur Kristóf
d43466e917 ac/nir: Remove ac_nir_calc_io_offset function.
This function is not used anymore, because none of the callers
rely on driver locations (intrinsic base) anymore.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
2024-08-08 16:55:02 +00:00
Timur Kristóf
7c009172e3 ac/nir/esgs: Map linked ES/GS I/O based on GS input mask.
With this change, ES/GS	linking	will not rely on driver	locations
anymore (driver locations are considered deprecated in NIR).

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
2024-08-08 16:55:02 +00:00
Timur Kristóf
d758bea8dd ac/nir/tess: Map linked LS/HS I/O based on TCS input mask.
With this change, LS/HS linking will not rely on driver locations
anymore (driver locations are considered deprecated in NIR).

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
2024-08-08 16:55:02 +00:00
Timur Kristóf
b162c7962f ac/nir: Add helper for I/O location mapping.
Map I/O locations based on a prefix sum (for linked shaders),
or based on the provided callback.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
2024-08-08 16:55:02 +00:00
Timur Kristóf
ed6499db6b ac/nir/esgs: Don't emit ES outputs that aren't read by GS.
This is	necessary to prevent a regression from a future	commit,
in order to allow us to	map output locations differently.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
2024-08-08 16:55:02 +00:00
Timur Kristóf
6d83389a39 ac/nir/esgs: Add gs_inputs_read to ES output lowering.
This commit just adds the field, it will be taken into use
in a following commit.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
2024-08-08 16:55:02 +00:00
Timur Kristóf
9aa5c38e8d ac/nir/tess: Don't emit VS outputs that aren't read by TCS.
This is necessary to prevent a regression from a future commit,
in order to allow us to map output locations differently.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
2024-08-08 16:55:02 +00:00
Timur Kristóf
b5f53fdf32 ac/nir/tess: Add tcs_inputs_read to LS output lowering.
This commit just adds the field, it will be taken into use
in a following commit.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
2024-08-08 16:55:02 +00:00
Timur Kristóf
a8d78f889e radv: Add gs/hs_inputs_read field for linked LS and ES.
This will be used in the following commits to determine the
memory locations for LS/HS and ES/GS linking.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
2024-08-08 16:55:02 +00:00
Alyssa Rosenzweig
daa97bb41a amd: switch to derivative intrinsics
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30565>
2024-08-08 15:26:07 +00:00
Alyssa Rosenzweig
048173a55a radv: use glsl function name for dFdxfine
since fddx isn't a name used anywhere now

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30565>
2024-08-08 15:26:07 +00:00
Alyssa Rosenzweig
530498cb83 treewide: use new-style derivative builders
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30565>
2024-08-08 15:26:07 +00:00
Georg Lehmann
f36fccabf5 aco: optimize 64bit find_lsb/find_msb
No Foz-DB changes, but this should be better, especially for gfx6-7 where
uadd_sat is 2 instructions.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30549>
2024-08-08 11:15:55 +00:00
Collabora's Gfx CI Team
cf2b156f2e Uprev Piglit to 0453436872b6e4d502c2e87817addb95e0d77e3b
4a62c26721...0453436872

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30559>
2024-08-08 02:25:45 +00:00
Marek Olšák
97d664b22f ac/surface/gfx12: turn off HiZ for pre-production samples
Fixes: f703dfd1bb - radeonsi: add gfx12

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30525>
2024-08-07 20:35:18 +00:00
Agate, Jesse
282ad9d864 amd/vpelib: Refactor frontend and backend config callback
Refactor and rename frontend and backend config callback.

Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Acked-by: Jack Chih <chiachih@amd.com>
Signed-off-by: Jesse Agate <Jesse.Agate@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>
2024-08-07 16:46:25 +00:00
Alan Liu
4886ee5caf amd/vpelib: Amend log for tone map support check
[Why & How]
Amend the log when failing to support tone mapping.

Reviewed-by: Tomson Chang <tomson.chang@amd.com>
Reviewed-by: Jude Shih <Jude.Shih@amd.com>
Acked-by: Jack Chih <chiachih@amd.com>
Signed-off-by: Alan Liu <haoping.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>
2024-08-07 16:46:25 +00:00
Kovac, Krunoslav
c5e2c4feaf amd/vpelib: Refactor MPC registers
Refactor MPC registers.
3DLUT programming is largely the same but register are renamed to be in
VPMPC_RMCM (as opposed to VPMPCC_MCM). Note that they are still inside
MCM so governed by MCM control location.

Reviewed-by: Roy Chan <Roy.Chan@amd.com>
Acked-by: Jack Chih <chiachih@amd.com>
Signed-off-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>
2024-08-07 16:46:25 +00:00
Agate, Jesse
63d8fa3f28 amd/vpelib: Refactor structs for API change
Refactor structs for API change.

Reviewed-by: Roy Chan <Roy.Chan@amd.com>
Acked-by: Jack Chih <chiachih@amd.com>
Signed-off-by: Jesse Agate <Jesse.Agate@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>
2024-08-07 16:46:25 +00:00
Hsieh, Mike
5e3b3ed8f7 amd/vpelib: Refactor OPP registers
Refactor OPP registers.

---------

Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Acked-by: Jack Chih <chiachih@amd.com>
Signed-off-by: Mike Hsieh <Mike.Hsieh@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>
2024-08-07 16:46:25 +00:00
Kovac, Krunoslav
914eb0a212 amd/vpelib: MPC refactoring HW registers
In order to be able to share HW registers, some refactoring
is needed.

Reviewed-by: Roy Chan <Roy.Chan@amd.com>
Reviewed-by: Tomson Chang <tomson.chang@amd.com>
Acked-by: Jack Chih <chiachih@amd.com>
Signed-off-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>
2024-08-07 16:46:25 +00:00
Assadian, Navid
a76d1aa565 amd/vpelib: Fix whitepoint for geometric downscaling
Fix whitepoint for geometric down scaling.

---------

Reviewed-by: Roy Chan <Roy.Chan@amd.com>
Acked-by: Jack Chih <chiachih@amd.com>
Signed-off-by: Navid Assadian <navid.assadian@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>
2024-08-07 16:46:25 +00:00
Visan, Tiberiu
30a28b76c8 amd/vpelib: set the same range for clr adj
Change the range for color adjustments and also modify bright cap.

Reviewed-by: Roy Chan <Roy.Chan@amd.com>
Acked-by: Jack Chih <chiachih@amd.com>
Signed-off-by: Tiberiu Visan <Tiberiu.Visan@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>
2024-08-07 16:46:25 +00:00
Assadian, Navid
e1ef91ac2a amd/vpelib: Fix CS translation for geometric downscaling
Geometric downscaling uses RGB10 as the intermediate format. The support for P601 and JFIF with RGB formats is added.

Co-authored-by: Roy Chan <roy.chan@amd.com>
Reviewed-by: Roy Chan <Roy.Chan@amd.com>
Acked-by: Jack Chih <chiachih@amd.com>
Signed-off-by: Navid Assadian <navid.assadian@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>
2024-08-07 16:46:25 +00:00
Assadian, Navid
699f88f844 amd/vpelib: Add API function to get taps
A module to calculate the number of taps is added to the API.
Additionally, the get_optimal_taps module is moved from dpp to resource.

Reviewed-by: Roy Chan <Roy.Chan@amd.com>
Acked-by: Jack Chih <chiachih@amd.com>
Signed-off-by: Navid Assadian <navid.assadian@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>
2024-08-07 16:46:25 +00:00
Assadian, Navid
4fc221524c amd/vpelib: Change Max DS support to 4:1
Since VPE can use upto 8 taps, for quality purpose vpelib cannot support
downscaling ratio more than 4:1. The caps value needed to be modified to
reject this case earlier.

Reviewed-by: Roy Chan <Roy.Chan@amd.com>
Acked-by: Jack Chih <chiachih@amd.com>
Signed-off-by: Navid Assadian <navid.assadian@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>
2024-08-07 16:46:25 +00:00
Kovac, Krunoslav
e6dd0de4d9 amd/vpelib: DPP starting changes
Refactor DPP registers to split into common and version specific.

Gamut remap for DPP will likely move to MPC. For this, we need MPC changes
and refactor program_front_end/back_end so the correct block does it.

Reviewed-by: Roy Chan <Roy.Chan@amd.com>
Acked-by: Jack Chih <chiachih@amd.com>
Signed-off-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>
2024-08-07 16:46:25 +00:00
Lin, Ricky
54d1d41e10 amd/vpelib: Added JFIF format to RGB output side
Added JFIF format to RGB output side, due to geometric scaling will
change cs parameter to JFIF.

---------

Reviewed-by: Tomson Chang <tomson.chang@amd.com>
Acked-by: Jack Chih <chiachih@amd.com>
Signed-off-by: rickylin <ricky.lin@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>
2024-08-07 16:46:25 +00:00
Hsieh, Mike
746556d585 amd/vpelib: Remove deprecated update_3dlut flag
[WHY & HOW]
update_3dlut flag has been replaced by UID mechanism.
Remove update_3dlut flag and update related functions.

Reviewed-by: Jesse Agate <jesse.agate@amd.com>
Reviewed-by: Tomson Chang <tomson.chang@amd.com>
Acked-by: Jack Chih <chiachih@amd.com>
Signed-off-by: Mike Hsieh <Mike.Hsieh@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30531>
2024-08-07 16:46:25 +00:00
Zan Dobersek
7fd5f76393 nir/lower_vars_to_scratch: calculate threshold-limited variable size separately
ir3's lowering of variables to scratch memory has to treat 8-bit values as
16-bit ones when comparing such value's size against the given threshold
since those values are handled through 16-bit half-registers. But those
values can still use natural 8-bit size and alignment for storing inside
scratch memory.

nir_lower_vars_to_scratch now accepts two size-and-alignment functions,
one used for calculating the variable size and the other for calculating
the size and alignment needed for storing inside scratch memory. Non-ir3
uses of this pass can just duplicate the currently-used function. ir3
provides a separate variable-size function that special-cases 8-bit types.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29875>
2024-08-07 14:32:28 +00:00
Georg Lehmann
d9849ac466 aco: test xor swap16 path
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30515>
2024-08-06 20:40:12 +00:00
Georg Lehmann
e0818cb87b aco/gfx11+: don't use VOP3 v_swap_b16
v_swap_b16 is not offically supported as VOP3, so it can't be used with v128-255.
Tests show that VOP3 appears to work correctly, but according to AMD that should
not be relied on.

https://github.com/llvm/llvm-project/pull/100442#discussion_r1703929676

Foz-DB Navi31:
Totals from 6 (0.01% of 79395) affected shaders:
Instrs: 64799 -> 65932 (+1.75%)
CodeSize: 360180 -> 368440 (+2.29%)
Latency: 1364648 -> 1365922 (+0.09%)
InvThroughput: 635843 -> 636475 (+0.10%)
Copies: 14766 -> 15698 (+6.31%)
VALU: 38743 -> 39675 (+2.41%)

Fixes: 80b8bbf0c5 ("aco/gfx11: use v_swap_b16")

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30515>
2024-08-06 20:40:12 +00:00
Alyssa Rosenzweig
d99c2ef059 nir/opt_uniform_atomics: add fs atomics predicated? flag
on agx (and mali), we predicate atomics on "if (!helper)", so doing so again in
this pass is redundant. and would cause a problem since we'd then have to lower
the "is helper inv?" flag late. so just skip the extra lowering code.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30488>
2024-08-06 11:48:17 -04:00
Marek Olšák
de83b5ef77 ac/surface/gfx12: fix setting tile_swizzle
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30503>
2024-08-05 19:35:39 +00:00
Collabora's Gfx CI Team
1d35b2f343 Uprev Piglit to 4a62c26721a47552a96416a134b789a813dd51a6
582f5490a1...4a62c26721

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30254>
2024-08-05 10:45:38 +00:00
Rhys Perry
8f3d0fbad7 aco: don't transform v_interp_p2_f32 with constant into fma
Since v_interp_p2_f32 with constant operands only happens on GFX11.5, this
should actually be fine in all cases where this is currently possible
(GFX11.5+ allows DPP with scalar src1). However, it does fail validation
because we haven't updated that yet.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Fixes: bee487df48 ("aco/gfx11.5+: use vinterp for fddx/fddy")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30477>
2024-08-05 09:32:24 +00:00
Rhys Perry
911fdce0b6 aco: fix validation of v_s_ opcodes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Fixes: 284b9965e8 ("aco/gfx11.5+: allow sgpr dst for trans ops and use pseudo scalar ops on gfx12")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30477>
2024-08-05 09:32:24 +00:00