Commit graph

2014 commits

Author SHA1 Message Date
Yonggang Luo
c74595ead3 radv/r600/clover: Getting libelf to be optional
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18503>
2022-09-22 05:07:35 +00:00
Timur Kristóf
2274b26dfb ac/nir/ngg: Don't initialize same-invocation mesh shader outputs.
This is actually not necessary and generates a lot of superfluous
instructions at every phi (setting the value to zero).

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18566>
2022-09-21 16:14:59 +00:00
Timur Kristóf
697ea02202 ac/nir/ngg: Don't use LDS for same-invocation indices and cull outputs.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18566>
2022-09-21 16:14:59 +00:00
Yonggang Luo
b70e92fe04 radv: Remove the redundant #include <gelf.h> and #include <libelf.h> in ac_binary.c
It's not access these two header in the source code

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18682>
2022-09-20 18:40:50 +00:00
Bas Nieuwenhuizen
266fe31666 ac/surface: Fix some warnings.
../mesa/src/amd/common/ac_surface.c:2324:48: warning: implicit conversion from enumeration type 'AddrResourceType' (aka 'enum _AddrResourceType') to different enumeration type 'enum gfx9_resource_type' [-Wenum-conversion]
   surf->u.gfx9.resource_type = AddrSurfInfoIn.resourceType;
                              ~ ~~~~~~~~~~~~~~~^~~~~~~~~~~~
../mesa/src/amd/common/ac_surface.c:3046:38: warning: implicit conversion from enumeration type 'const enum gfx9_resource_type' to different enumeration type 'AddrResourceType' (aka 'enum _AddrResourceType') [-Wenum-conversion]
   input.resourceType = surf->u.gfx9.resource_type;
                      ~ ~~~~~~~~~~~~~^~~~~~~~~~~~~
../mesa/src/amd/common/ac_surface.c:3069:38: warning: implicit conversion from enumeration type 'const enum gfx9_resource_type' to different enumeration type 'AddrResourceType' (aka 'enum _AddrResourceType') [-Wenum-conversion]
   input.resourceType = surf->u.gfx9.resource_type;

The enums are compatible so lets just add some casts.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18694>
2022-09-20 09:25:09 +00:00
Rhys Perry
7d26fafacf radv: fix dynamic RT stack size with VGPR spilling
VGPR spilling might cause VGPRs to be spilled at scratch offset 0, so we
can't use that.

fossil-db (Sienna Cichlid, Q2RTX and Control):
Totals from 4 (0.26% of 1524) affected shaders:
Instrs: 8734 -> 8737 (+0.03%)
CodeSize: 48492 -> 48504 (+0.02%)
Latency: 384375 -> 384369 (-0.00%)
InvThroughput: 256250 -> 256246 (-0.00%)
Copies: 1312 -> 1313 (+0.08%)
Branches: 256 -> 258 (+0.78%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18541>
2022-09-20 01:39:20 +00:00
James Park
b7d4897df9 meson,amd: Remove Windows libelf wrap
Functionality isn't worth the maintenance cost.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18478>
2022-09-19 12:51:12 +00:00
Qiang Yu
074f3216f2 ac/nir/ngg: support gs streamout
Port from radeonsi.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17654>
2022-09-16 08:51:28 +00:00
Qiang Yu
3fe8f88124 ac/nir/ngg: support multi stream per output slot for gs
radeonsi may pack multi stream output to same slot.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17654>
2022-09-16 08:51:28 +00:00
Qiang Yu
c25564b552 ac/nir/ngg: ngg_gs_load_out_vtx_primflag support stream
Streamout need primflag for any stream.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17654>
2022-09-16 08:51:28 +00:00
Qiang Yu
5ec79f9899 ac/nir/ngg: nogs support streamout
Port from radeonsi.

Works on both GFX11 and GFX10. Although GFX10 can do atomic
GDS add on all threads, now we just disable the NGG streamout
for GFX10, so it's OK.

There's a difference for the GFX11 implementation with radeonsi
that we do all 4 buffer/stream info calc on a single thread.
It's just because this is simple, we need to update GDS on a
single thread anyway, and streamout is not that performance
critical to loss a small amount of instruction. We may change
to a better implementation when using register based streamout.

When streamout enabled, ES threads need to save all vertex
attributes to LDS besides position. This is because we don't
know where in the streamout buffer to export the attributes to
and wheter there are space in the streamout buffer.

Streamout is done in primitives, so we need to check if there
is space and where the current primitive should be written to
by GDS atomic add, then in GS threads do the streamout.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17654>
2022-09-16 08:51:28 +00:00
Qiang Yu
30c7608516 ac/nir/ngg: cleanup prim id to prepare for streamout
Streamout also need barrier after culling, so move the
prim id barrier up to after culling.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17654>
2022-09-16 08:51:28 +00:00
James Zhu
fe8e18c782 amd/common: some ASICs with gfx9 use compute rings for render
Some ASICs with gfx9 use compute rings for render.

Fixes: 983223de5d - ac/gpu_info: use the kernel-reported
GFX IP version to set gfx_level

-v2: update merge requests num

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18553>
2022-09-12 16:24:37 +00:00
Samuel Pitoiset
8866e6582d radv: emit SQTT markers for RT related commands
This reports RT commands like vkCmdTraceRaysKHR and
vkCmdBuildAccelerationStructuresKHR in RGP.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18496>
2022-09-09 05:51:23 +00:00
Bas Nieuwenhuizen
ae7532e0cc amd/common: Disable DCC retile modifiers on RDNA1
Some claims of corruption, modifier-less Mesa already doesn't do
it. Since these modifiers have no purpose besides being displayed
lets just disable in Mesa.

Cc: mesa-stable

Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18140>
2022-09-07 23:41:28 +00:00
Bas Nieuwenhuizen
af4b656817 amd/common: Don't rely on DCN support checks with modifiers.
Going to be a bad time if they disagree, which is bound to happen
sometimes. Not asserting and stuff tends to be a better experience
than crashing.

Cc: mesa-stable

Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18140>
2022-09-07 23:41:28 +00:00
Timur Kristóf
c7ff93a766 ac/nir/ngg: Add EXT_mesh_shader vertex/primitive count.
In EXT_mesh_shader the vertex and primitive counts are set using a
built-in SetMeshOutputsEXT function.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18367>
2022-09-05 13:09:29 +00:00
Timur Kristóf
448d09d44a ac/nir/ngg: Add EXT_mesh_shader CullPrimitiveEXT output.
This is a per-primitive boolean output.
When set to 1, the primitive should be culled.

Implement this by using this boolean as the null primitive
flag for primitive exports.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18367>
2022-09-05 13:09:29 +00:00
Timur Kristóf
1f8f4570f0 ac/nir/ngg: Add EXT_mesh_shader primitive indices.
In EXT_mesh_shader the indices output is an array of vectors
which is indexed by the primitive index.

(They practically behave like a per-primitive output, although
technically the spec does not treat them as per-primitive.)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18367>
2022-09-05 13:09:29 +00:00
Marek Olšák
a6050a43ca ac/surface: disallow 256KB swizzle modes on gfx11 APUs
It doesn't work.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18340>
2022-09-05 08:59:59 +00:00
Marek Olšák
aef7ea868f ac/gpu_info: handle LPDDR4 and 5 in ac_memory_ops_per_clock
and update amdgpu_drm.h

Fixes: 50238f4958 - amd/common: Remove redundant code for determining memory ops per clock
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7163

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18340>
2022-09-05 08:59:59 +00:00
Rhys Perry
6a2ada93b4 ac: add ac_vtx_format_info
This will be used by RADV and ACO.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17894>
2022-08-30 19:02:11 +00:00
Rhys Perry
aa2d6e020b Revert "nir: Drop the unused instr arg for src/dest copy functions."
This reverts commit c3a0184118.

Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12910>
2022-08-30 18:21:44 +00:00
Marcin Ślusarz
3531c1e315 nir/lower_task_shader: print shader after each step
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17618>
2022-08-29 12:42:40 +00:00
Samuel Pitoiset
a04fd5c61f ac: constify ac_compute_cs_workgroup_size()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18210>
2022-08-26 14:07:09 +00:00
Qiang Yu
f75452918b ac/nir/ngg: support clipdist culling
Port from radeonsi.

Besides vertex position based primitive culling, clipdist
attribute can also be used to cull a primitive. Normally
it's used by fixed-pipeline, but when NGG we can treate it
as a culling condition to filter out invisible primitive
before fixed-pipeline.

There are two kinds of clipdist:
1. user define a clip plane explicitly by glClipPlane(),
   fixed-pipeline calculate with vertex position to get
   clipdist, then cull. This is the legacy way.
2. Now GLSL define gl_ClipDistance/gl_CullDiatance so that
   user can calculate clipdist in any way he like.

This implementation support both way.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651>
2022-08-26 05:50:30 +00:00
Qiang Yu
620e62bb39 ac/nir/ngg: support component position store
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651>
2022-08-26 05:50:30 +00:00
Qiang Yu
1bdeb961bd ac/nir/ngg: add gs culling
Port from radeonsi.

Cull primitive after GS thread and before final vertex/primitive
export. GS culling is like VS/TES culling which read out saved
vertex positions of a primitive from LDS then call the primitive
culling algorithm to check whether it's visiable or not, only
passed primitives will be exported.

Unlike the VS/TES culling that read vertex index of a primitive
from VGPRs as shader args, GS will set a primitive complete flag
for each last vertex of a primitive in LDS, so that vertex thread
know the previous 1/2/3 vertex can form a primitive and do primitive
culling.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651>
2022-08-26 05:50:30 +00:00
Qiang Yu
b212fd4b1e ac/nir/ngg: save and restore position output base for nogs
radeonsi has different driver_location and io location.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651>
2022-08-26 05:50:30 +00:00
Qiang Yu
7e17e01973 ac/nir/ngg: save and restore output bit size for gs
radeonsi does not have io nir variables, so need to save output
bit size when lower store_output intrinsic.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651>
2022-08-26 05:50:30 +00:00
Qiang Yu
93a635c2c8 ac/nir/ngg: use same driver location for gs output
driver_location and io location are different for radeonsi,
and radeonsi llvm rely on the correct driver_location to
index output variables.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651>
2022-08-26 05:50:30 +00:00
Qiang Yu
347a94666c ac/nir/ngg: fix and simplify gs store output lower
Simplify: 64bit IO has been lowered by nir_lower_io with
nir_lower_io_lower_64bit_to_32, so no need to handle in the
ngg lower.

Fix: we need to increase io_sem.location by base_offset for
correct gs_output_info.

radeonsi has different driver_location and io location, so
also change the output variable index to io location.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651>
2022-08-26 05:50:30 +00:00
Qiang Yu
db0e9d3cab ac/nir/ngg: support line culling
Port from ac_llvm_cull.c

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651>
2022-08-26 05:50:30 +00:00
Qiang Yu
f1f2c931a7 ac/nir/cull: support caller react when primitive is rejected
Make accept_func optional, and return accpect result for caller
react when primitive is rejected.

This is for GS culling.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651>
2022-08-26 05:50:30 +00:00
Qiang Yu
035d70f721 ac/nir/ngg,radv: use nir_load_viewport_xy_scale_and_offset
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651>
2022-08-26 05:50:30 +00:00
John Brooks
98ba1e0d81 radv: Fix mipmap views on GFX10+
As explained in the previous commit, GFX9+ has issues with addressing
mipmaps in block-compressed images. In the case of copy commands, we fix
this by doing an extra copy for the missing blocks.

For GFX10, the mipmap layout in memory allows us to do better than that. We
can change the base level of the descriptor to one level bigger than the
requested level and adjust the extent and address to match. This is done by
ComputeNonBlockCompressedView in addrlib. Thus on GFX10 we can skip the
fixup copy workaround, and this will also fix cases outside of explicit
copy commands.

Signed-off-by: John Brooks <john@fastquake.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17970>
2022-08-23 19:01:18 +00:00
John Brooks
35f053ba8c radv: Fix corrupted mipmap copies on GFX9+
GFX9+ hardware has an issue where mipmap degradations are calculated
incorrectly due to using divide-by-two integer math and certain mipmap
sizes lose blocks.

This issue has been documented before, and we ported a workaround from
AMDVLK to increase the extent that is programmed into the descriptor, so
that the hardware arrives at the correct result. However, this is
insufficient as we cannot safely increase the extent beyond the physical
extent of the image in memory. If we can't increase it enough, the image
will still be missing blocks.

But there is still hope. In cases where RADV is responsible for copying to
or from an image (such as vkCmdCopyBufferToImage/vkCmdCopyImageToBuffer),
we can perform a second copy of the blocks that the hardware excluded so
that the resulting image is complete. This is another workaround from
AMDVLK.

This fixes corrupted textures in Halo: The Master Chief Collection.

v2: Add RADV_CMD_FLAG_INV_L2 | RADV_CMD_FLAG_INV_VCACHE to flush_bits
    just in case (Samuel Pitoiset)

Closes: #3347

Signed-off-by: John Brooks <john@fastquake.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17970>
2022-08-23 19:01:17 +00:00
Friedrich Vock
50238f4958 amd/common: Remove redundant code for determining memory ops per clock
Fixes: 82fd379d9e ("amd/common: move ac_memory_ops_per_clock into ac_gpu_info.h")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18038>
2022-08-16 19:06:21 +00:00
Friedrich Vock
82fd379d9e amd/common: move ac_memory_ops_per_clock into ac_gpu_info.h
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17973>
2022-08-10 10:58:54 +00:00
Timur Kristóf
dccd6f495a ac/nir/cull: Fix typo in bounding box culling.
Bounding box culling is only viable when the W of all
vertices are positive. Always accept triangles whose any
W is negative.

Fixes: 0d527bb1aa
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7018
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17929>
2022-08-08 11:16:04 +00:00
Timur Kristóf
e035c289b5 ac/nir/cull: Tweak phi for cull_small_primitive branch.
cull_small_primitive will now allow the caller to pass an
SSA def that it will use to determine if the primitive was
initially rejected.

This allows ACO to remove an s_branch instruction from every
NGG culling shader.

Fossil DB stats on Navi 21:

Totals from 60918 (45.16% of 134906) affected shaders:
CodeSize: 160086644 -> 159355824 (-0.46%); split: -0.46%, +0.00%
Instrs: 30477916 -> 30356092 (-0.40%); split: -0.40%, +0.00%
Latency: 139587915 -> 139611487 (+0.02%); split: -0.00%, +0.02%
InvThroughput: 21184261 -> 21184346 (+0.00%)
Copies: 2762930 -> 2702024 (-2.20%); split: -2.20%, +0.00%
Branches: 1236970 -> 1176052 (-4.92%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17919>
2022-08-06 20:43:36 +00:00
Timur Kristóf
e4b0caae61 ac/nir/cull: Make cull functions more consistent.
Now they all return whether the primitive was rejected.

No Fossil DB changes.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17870>
2022-08-05 22:10:28 +00:00
Timur Kristóf
c721f751f2 ac/nir/ngg: Move LDS store of accepted flag into the inner branch.
For primitives which are rejected based on only W and face, this
will reduce the number of executed branches.

Fossil DB stats on Navi 21:

Totals from 60918 (45.16% of 134906) affected shaders:
CodeSize: 160330564 -> 160086644 (-0.15%)
Instrs: 30477385 -> 30477916 (+0.00%); split: -0.00%, +0.00%
Latency: 139802763 -> 139587915 (-0.15%); split: -0.15%, +0.00%
InvThroughput: 21198444 -> 21184261 (-0.07%); split: -0.07%, +0.00%
SClause: 749811 -> 749810 (-0.00%)
Copies: 2701482 -> 2762930 (+2.27%); split: -0.00%, +2.28%

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17870>
2022-08-05 22:10:28 +00:00
Timur Kristóf
0d527bb1aa ac/nir/cull: Change if condition for bounding box culling.
The previous code checked all_w_positive in the if condition.
Instead, always execute the bbox culling code and include
all_w_positive at the end.

We assume checking in the if is not beneficial because it's
very unlikely that there is no primitive in a wave whose W are
not all positive.

This allows moving other things to the condition
in the next commit.

Fossil DB stats on Navi 21:

Totals from 60918 (45.16% of 134906) affected shaders:
CodeSize: 160574204 -> 160330564 (-0.15%); split: -0.15%, +0.00%
Instrs: 30538297 -> 30477385 (-0.20%); split: -0.20%, +0.00%
Latency: 139810902 -> 139802763 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 21198449 -> 21198444 (-0.00%); split: -0.00%, +0.00%
SClause: 749810 -> 749811 (+0.00%)
Copies: 2701474 -> 2701482 (+0.00%); split: -0.00%, +0.00%

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17870>
2022-08-05 22:10:28 +00:00
Timur Kristóf
fb4e68b724 ac/nir/cull: Move the contents of cull_bbox into ac_nir_cull_triangle.
No Fossil DB changes.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17870>
2022-08-05 22:10:28 +00:00
Timur Kristóf
e2ca24063a ac/nir/cull: Move some code from cull_bbox into helper functions.
No Fossil DB changes.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17870>
2022-08-05 22:10:27 +00:00
Marek Olšák
0c1801706e ac/llvm: handle external textures in ac_nir_lower_resinfo
Fixes: 4f622d62d0 - ac/nir: add ac_nir_lower_resinfo
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6993

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17902>
2022-08-05 09:04:17 +00:00
Marek Olšák
4f622d62d0 ac/nir: add ac_nir_lower_resinfo
Emulating image_get_resinfo should be faster than using the hw.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17693>
2022-08-03 17:44:15 +00:00
Marek Olšák
f129db911b radeonsi/gfx11: use a better workaround for the export conflict bug
This is recommended for better performance.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17864>
2022-08-03 00:57:16 +00:00
Marek Olšák
2ed9eb1b63 radeonsi/gfx11: enable shader prefetch except for initial chip revisions
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17864>
2022-08-03 00:57:16 +00:00