Commit graph

2030 commits

Author SHA1 Message Date
Qiang Yu
540eafada1 ac/nir/ngg: add streamout emitted primitive query
For radeonsi to implement GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17457>
2022-10-25 12:58:43 +00:00
Qiang Yu
188a7f9226 ac/nir/ngg: add query param to ac_nir_lower_ngg_gs
radeonsi may disable it. gfx_level will also be used by latter
vertex param export when gfx11.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17457>
2022-10-25 12:58:43 +00:00
Qiang Yu
a119a6464f nir,ac,radv: add primitive count add intrinsics
radeonsi use shader buffer, but radv use gds for the query
result storage.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17457>
2022-10-25 12:58:43 +00:00
Pierre-Eric Pelloux-Prayer
8034a71430 radeonsi/sqtt: re-export shaders in a single bo
RGP expects a pipeline's shaders to be all stored sequentially, eg:

  [vs][ps][gs]

As such, it assumes a single bo is dumped to the .rgp file, with
the following info:
  * va of the bo
  * offset to each shader inside the bo

For radeonsi, the shaders are stored individually, so we may have
a big gap between the shaders forming a pipeline => we can produce
very large file because the layout in the file must match the one
in memory (see the warning in ac_rgp_file_write_elf_text).

This commit implements a workaround: gfx shaders are re-exported as a
pipeline.

To update the shader address, a new state is created (sqtt_pipeline),
which will overwrite the needed _PGM_LO_* registers.

This reduces DeuxEX rgp captures from 150GB+ to less than 100MB.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18865>
2022-10-25 11:58:07 +00:00
Qiang Yu
7ee0b8b8df ac/nir/ngg,radv: use different counters for shader queries
VK_QUERY_TYPE_PRIMITIVES_GENERATED_EXT should count for each
stream.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7409
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19015>
2022-10-25 02:42:52 +00:00
Qiang Yu
83643e4dc8 nir,ac/nir/ngg,radv: split shader_query_enabled_amd
For used by different counter.

Vulkan:
1. VK_QUERY_PIPELINE_STATISTIC_GEOMETRY_SHADER_PRIMITIVES_BIT,
   sum generated primitives of all 4 streams when GS.
2. VK_QUERY_TYPE_PRIMITIVES_GENERATED_EXT, count generated
   primitives for all 4 streams when VS/TES/GS.
3. VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT, count generated
   and streamout primitives for all 4 streams when VS/TES/GS.

OpenGL:
1. GL_GEOMETRY_SHADER_PRIMITIVES_EMITTED_ARB, sum generated
   primitives for all 4 streams when GS.
2. GL_PRIMITIVES_GENERATED, count generated primitives for all 4
   streams when VS/TES/GS.
3. GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN, count streamout
   primitives for all 4 streams when VS/TES/GS.

pipeline_stat_query_enabled_amd is for Vulkan 1 and OpenGL 1.
xfb_query_enabled_amd is for Vulkan 2/3 and OpenGL 2/3.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19015>
2022-10-25 02:42:52 +00:00
Samuel Pitoiset
1ec5b6774d ac: fix has_vrs_ds_export_bug for VanGogh
Missed it.

Fixes: 0a8a9d9d63 ("ac: add radeon_info::has_vrs_ds_export_bug")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19272>
2022-10-24 12:01:52 +00:00
Pierre-Eric Pelloux-Prayer
0f00f74b20 ac/llvm: port functions to use ac_llvm_pointer
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19035>
2022-10-21 07:56:38 +00:00
Samuel Pitoiset
0a8a9d9d63 ac: add radeon_info::has_vrs_ds_export_bug
According to PAL, only NAVI21 and NAVI22 are affected.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19195>
2022-10-21 06:45:21 +00:00
Timur Kristóf
e52c2f4fca nir, ac, aco: Add index src to load_buffer_amd/store_buffer_amd.
Also modify all existing uses to pass a zero to this new src.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> (nir)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17551>
2022-10-20 20:00:50 +00:00
Timur Kristóf
c918f0934e nir, ac, aco: Add ACCESS intrinsic index to load/store_buffer_amd.
Previously, we always treated these as coherent, but now let's make
this configurable. Also set all current users to ACCESS_COHERENT.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> (nir)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17551>
2022-10-20 20:00:49 +00:00
Qiang Yu
97e1613b0e ac/nir/ngg: use nir_load_provoking_vtx_in_prim_amd in ngg lower
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19166>
2022-10-20 06:53:56 +00:00
Timur Kristóf
6586afd6d2 ac/nir/tess: Remove jump from tess factor writes.
When the output patch size <= 32 we can be sure regardless
of wave size that each wave will take this branch, therefore
the jump can be removed.

Fossil DB stats on Navi 21:

Totals from 1385 (1.03% of 134906) affected shaders:
CodeSize: 2664436 -> 2658896 (-0.21%)
Instrs: 488618 -> 487233 (-0.28%)
Latency: 2290157 -> 2289199 (-0.04%)
InvThroughput: 898658 -> 898364 (-0.03%)
Branches: 6554 -> 5169 (-21.13%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-By: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17921>
2022-10-11 15:42:54 +00:00
Timur Kristóf
892c15af64 ac/nir/ngg: Remove jumps from some branches where we know LGKMCNT==0.
The GPU can skip LDS instructions when LGKMCNT==0, and for these
branches this should be always faster than a jump.

Fossil DB stats on Navi 21:

Totals from 60918 (45.16% of 134906) affected shaders:
CodeSize: 158624792 -> 157893776 (-0.46%)
Instrs: 30234254 -> 30051500 (-0.60%)
Latency: 139521675 -> 139434597 (-0.06%); split: -0.06%, +0.00%
InvThroughput: 21184146 -> 21183653 (-0.00%); split: -0.00%, +0.00%
Branches: 1115134 -> 932380 (-16.39%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-By: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17921>
2022-10-11 15:42:54 +00:00
James Zhu
3e2f7905a6 radeonsi/vcn: enable jpeg decode of yuv444 and yuv400
v2: set third plane offset only for 3 plane formats (Boyuan Zhang)

Signed-off-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18914>
2022-10-07 15:14:39 +00:00
Timur Kristóf
3ca8402ec7 ac/nir/ngg: Fix cross-invocation indices and cull outputs.
The layout calculation accidentally thought these would be
stored in variables, but that's not the case.

Fixes: 697ea02202
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18846>
2022-10-05 19:47:25 +00:00
Yonggang Luo
c74595ead3 radv/r600/clover: Getting libelf to be optional
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18503>
2022-09-22 05:07:35 +00:00
Timur Kristóf
2274b26dfb ac/nir/ngg: Don't initialize same-invocation mesh shader outputs.
This is actually not necessary and generates a lot of superfluous
instructions at every phi (setting the value to zero).

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18566>
2022-09-21 16:14:59 +00:00
Timur Kristóf
697ea02202 ac/nir/ngg: Don't use LDS for same-invocation indices and cull outputs.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18566>
2022-09-21 16:14:59 +00:00
Yonggang Luo
b70e92fe04 radv: Remove the redundant #include <gelf.h> and #include <libelf.h> in ac_binary.c
It's not access these two header in the source code

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18682>
2022-09-20 18:40:50 +00:00
Bas Nieuwenhuizen
266fe31666 ac/surface: Fix some warnings.
../mesa/src/amd/common/ac_surface.c:2324:48: warning: implicit conversion from enumeration type 'AddrResourceType' (aka 'enum _AddrResourceType') to different enumeration type 'enum gfx9_resource_type' [-Wenum-conversion]
   surf->u.gfx9.resource_type = AddrSurfInfoIn.resourceType;
                              ~ ~~~~~~~~~~~~~~~^~~~~~~~~~~~
../mesa/src/amd/common/ac_surface.c:3046:38: warning: implicit conversion from enumeration type 'const enum gfx9_resource_type' to different enumeration type 'AddrResourceType' (aka 'enum _AddrResourceType') [-Wenum-conversion]
   input.resourceType = surf->u.gfx9.resource_type;
                      ~ ~~~~~~~~~~~~~^~~~~~~~~~~~~
../mesa/src/amd/common/ac_surface.c:3069:38: warning: implicit conversion from enumeration type 'const enum gfx9_resource_type' to different enumeration type 'AddrResourceType' (aka 'enum _AddrResourceType') [-Wenum-conversion]
   input.resourceType = surf->u.gfx9.resource_type;

The enums are compatible so lets just add some casts.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18694>
2022-09-20 09:25:09 +00:00
Rhys Perry
7d26fafacf radv: fix dynamic RT stack size with VGPR spilling
VGPR spilling might cause VGPRs to be spilled at scratch offset 0, so we
can't use that.

fossil-db (Sienna Cichlid, Q2RTX and Control):
Totals from 4 (0.26% of 1524) affected shaders:
Instrs: 8734 -> 8737 (+0.03%)
CodeSize: 48492 -> 48504 (+0.02%)
Latency: 384375 -> 384369 (-0.00%)
InvThroughput: 256250 -> 256246 (-0.00%)
Copies: 1312 -> 1313 (+0.08%)
Branches: 256 -> 258 (+0.78%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18541>
2022-09-20 01:39:20 +00:00
James Park
b7d4897df9 meson,amd: Remove Windows libelf wrap
Functionality isn't worth the maintenance cost.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18478>
2022-09-19 12:51:12 +00:00
Qiang Yu
074f3216f2 ac/nir/ngg: support gs streamout
Port from radeonsi.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17654>
2022-09-16 08:51:28 +00:00
Qiang Yu
3fe8f88124 ac/nir/ngg: support multi stream per output slot for gs
radeonsi may pack multi stream output to same slot.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17654>
2022-09-16 08:51:28 +00:00
Qiang Yu
c25564b552 ac/nir/ngg: ngg_gs_load_out_vtx_primflag support stream
Streamout need primflag for any stream.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17654>
2022-09-16 08:51:28 +00:00
Qiang Yu
5ec79f9899 ac/nir/ngg: nogs support streamout
Port from radeonsi.

Works on both GFX11 and GFX10. Although GFX10 can do atomic
GDS add on all threads, now we just disable the NGG streamout
for GFX10, so it's OK.

There's a difference for the GFX11 implementation with radeonsi
that we do all 4 buffer/stream info calc on a single thread.
It's just because this is simple, we need to update GDS on a
single thread anyway, and streamout is not that performance
critical to loss a small amount of instruction. We may change
to a better implementation when using register based streamout.

When streamout enabled, ES threads need to save all vertex
attributes to LDS besides position. This is because we don't
know where in the streamout buffer to export the attributes to
and wheter there are space in the streamout buffer.

Streamout is done in primitives, so we need to check if there
is space and where the current primitive should be written to
by GDS atomic add, then in GS threads do the streamout.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17654>
2022-09-16 08:51:28 +00:00
Qiang Yu
30c7608516 ac/nir/ngg: cleanup prim id to prepare for streamout
Streamout also need barrier after culling, so move the
prim id barrier up to after culling.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17654>
2022-09-16 08:51:28 +00:00
James Zhu
fe8e18c782 amd/common: some ASICs with gfx9 use compute rings for render
Some ASICs with gfx9 use compute rings for render.

Fixes: 983223de5d - ac/gpu_info: use the kernel-reported
GFX IP version to set gfx_level

-v2: update merge requests num

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18553>
2022-09-12 16:24:37 +00:00
Samuel Pitoiset
8866e6582d radv: emit SQTT markers for RT related commands
This reports RT commands like vkCmdTraceRaysKHR and
vkCmdBuildAccelerationStructuresKHR in RGP.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18496>
2022-09-09 05:51:23 +00:00
Bas Nieuwenhuizen
ae7532e0cc amd/common: Disable DCC retile modifiers on RDNA1
Some claims of corruption, modifier-less Mesa already doesn't do
it. Since these modifiers have no purpose besides being displayed
lets just disable in Mesa.

Cc: mesa-stable

Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18140>
2022-09-07 23:41:28 +00:00
Bas Nieuwenhuizen
af4b656817 amd/common: Don't rely on DCN support checks with modifiers.
Going to be a bad time if they disagree, which is bound to happen
sometimes. Not asserting and stuff tends to be a better experience
than crashing.

Cc: mesa-stable

Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18140>
2022-09-07 23:41:28 +00:00
Timur Kristóf
c7ff93a766 ac/nir/ngg: Add EXT_mesh_shader vertex/primitive count.
In EXT_mesh_shader the vertex and primitive counts are set using a
built-in SetMeshOutputsEXT function.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18367>
2022-09-05 13:09:29 +00:00
Timur Kristóf
448d09d44a ac/nir/ngg: Add EXT_mesh_shader CullPrimitiveEXT output.
This is a per-primitive boolean output.
When set to 1, the primitive should be culled.

Implement this by using this boolean as the null primitive
flag for primitive exports.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18367>
2022-09-05 13:09:29 +00:00
Timur Kristóf
1f8f4570f0 ac/nir/ngg: Add EXT_mesh_shader primitive indices.
In EXT_mesh_shader the indices output is an array of vectors
which is indexed by the primitive index.

(They practically behave like a per-primitive output, although
technically the spec does not treat them as per-primitive.)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18367>
2022-09-05 13:09:29 +00:00
Marek Olšák
a6050a43ca ac/surface: disallow 256KB swizzle modes on gfx11 APUs
It doesn't work.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18340>
2022-09-05 08:59:59 +00:00
Marek Olšák
aef7ea868f ac/gpu_info: handle LPDDR4 and 5 in ac_memory_ops_per_clock
and update amdgpu_drm.h

Fixes: 50238f4958 - amd/common: Remove redundant code for determining memory ops per clock
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7163

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18340>
2022-09-05 08:59:59 +00:00
Rhys Perry
6a2ada93b4 ac: add ac_vtx_format_info
This will be used by RADV and ACO.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17894>
2022-08-30 19:02:11 +00:00
Rhys Perry
aa2d6e020b Revert "nir: Drop the unused instr arg for src/dest copy functions."
This reverts commit c3a0184118.

Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12910>
2022-08-30 18:21:44 +00:00
Marcin Ślusarz
3531c1e315 nir/lower_task_shader: print shader after each step
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17618>
2022-08-29 12:42:40 +00:00
Samuel Pitoiset
a04fd5c61f ac: constify ac_compute_cs_workgroup_size()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18210>
2022-08-26 14:07:09 +00:00
Qiang Yu
f75452918b ac/nir/ngg: support clipdist culling
Port from radeonsi.

Besides vertex position based primitive culling, clipdist
attribute can also be used to cull a primitive. Normally
it's used by fixed-pipeline, but when NGG we can treate it
as a culling condition to filter out invisible primitive
before fixed-pipeline.

There are two kinds of clipdist:
1. user define a clip plane explicitly by glClipPlane(),
   fixed-pipeline calculate with vertex position to get
   clipdist, then cull. This is the legacy way.
2. Now GLSL define gl_ClipDistance/gl_CullDiatance so that
   user can calculate clipdist in any way he like.

This implementation support both way.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651>
2022-08-26 05:50:30 +00:00
Qiang Yu
620e62bb39 ac/nir/ngg: support component position store
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651>
2022-08-26 05:50:30 +00:00
Qiang Yu
1bdeb961bd ac/nir/ngg: add gs culling
Port from radeonsi.

Cull primitive after GS thread and before final vertex/primitive
export. GS culling is like VS/TES culling which read out saved
vertex positions of a primitive from LDS then call the primitive
culling algorithm to check whether it's visiable or not, only
passed primitives will be exported.

Unlike the VS/TES culling that read vertex index of a primitive
from VGPRs as shader args, GS will set a primitive complete flag
for each last vertex of a primitive in LDS, so that vertex thread
know the previous 1/2/3 vertex can form a primitive and do primitive
culling.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651>
2022-08-26 05:50:30 +00:00
Qiang Yu
b212fd4b1e ac/nir/ngg: save and restore position output base for nogs
radeonsi has different driver_location and io location.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651>
2022-08-26 05:50:30 +00:00
Qiang Yu
7e17e01973 ac/nir/ngg: save and restore output bit size for gs
radeonsi does not have io nir variables, so need to save output
bit size when lower store_output intrinsic.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651>
2022-08-26 05:50:30 +00:00
Qiang Yu
93a635c2c8 ac/nir/ngg: use same driver location for gs output
driver_location and io location are different for radeonsi,
and radeonsi llvm rely on the correct driver_location to
index output variables.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651>
2022-08-26 05:50:30 +00:00
Qiang Yu
347a94666c ac/nir/ngg: fix and simplify gs store output lower
Simplify: 64bit IO has been lowered by nir_lower_io with
nir_lower_io_lower_64bit_to_32, so no need to handle in the
ngg lower.

Fix: we need to increase io_sem.location by base_offset for
correct gs_output_info.

radeonsi has different driver_location and io location, so
also change the output variable index to io location.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651>
2022-08-26 05:50:30 +00:00
Qiang Yu
db0e9d3cab ac/nir/ngg: support line culling
Port from ac_llvm_cull.c

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651>
2022-08-26 05:50:30 +00:00
Qiang Yu
f1f2c931a7 ac/nir/cull: support caller react when primitive is rejected
Make accept_func optional, and return accpect result for caller
react when primitive is rejected.

This is for GS culling.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651>
2022-08-26 05:50:30 +00:00