Commit graph

1980 commits

Author SHA1 Message Date
Qiang Yu
035d70f721 ac/nir/ngg,radv: use nir_load_viewport_xy_scale_and_offset
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651>
2022-08-26 05:50:30 +00:00
John Brooks
98ba1e0d81 radv: Fix mipmap views on GFX10+
As explained in the previous commit, GFX9+ has issues with addressing
mipmaps in block-compressed images. In the case of copy commands, we fix
this by doing an extra copy for the missing blocks.

For GFX10, the mipmap layout in memory allows us to do better than that. We
can change the base level of the descriptor to one level bigger than the
requested level and adjust the extent and address to match. This is done by
ComputeNonBlockCompressedView in addrlib. Thus on GFX10 we can skip the
fixup copy workaround, and this will also fix cases outside of explicit
copy commands.

Signed-off-by: John Brooks <john@fastquake.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17970>
2022-08-23 19:01:18 +00:00
John Brooks
35f053ba8c radv: Fix corrupted mipmap copies on GFX9+
GFX9+ hardware has an issue where mipmap degradations are calculated
incorrectly due to using divide-by-two integer math and certain mipmap
sizes lose blocks.
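
As a rough illustration of the block loss described above, the sketch below compares the number of blocks a mip level needs with the number obtained by halving the base level's block count. The 4x4 block size, the function names, and the assumption that the hardware halves the block count directly are illustrative only and are not taken from the commit.

```python
BLOCK_DIM = 4  # assumed block-compressed footprint: 4x4 texels per block


def blocks_needed(width_px, level):
    """Blocks required to cover the texel width of the given mip level."""
    level_px = max(1, width_px >> level)
    return (level_px + BLOCK_DIM - 1) // BLOCK_DIM


def blocks_by_halving(width_px, level):
    """Blocks obtained by halving the base block count (assumed hardware math)."""
    base_blocks = (width_px + BLOCK_DIM - 1) // BLOCK_DIM
    return max(1, base_blocks >> level)


def missing_blocks(width_px, level):
    """Blocks that the divide-by-two result leaves out at this level."""
    return blocks_needed(width_px, level) - blocks_by_halving(width_px, level)
```

For a 20-texel-wide image, level 1 is 10 texels wide and needs 3 blocks, while halving the 5 base-level blocks gives only 2, so one block is lost. A 16-texel-wide image loses nothing at level 1, which is why only certain mipmap sizes are affected.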

This issue has been documented before, and we ported a workaround from
AMDVLK to increase the extent that is programmed into the descriptor, so
that the hardware arrives at the correct result. However, this is
insufficient as we cannot safely increase the extent beyond the physical
extent of the image in memory. If we can't increase it enough, the image
will still be missing blocks.

But there is still hope. In cases where RADV is responsible for copying to
or from an image (such as vkCmdCopyBufferToImage/vkCmdCopyImageToBuffer),
we can perform a second copy of the blocks that the hardware excluded so
that the resulting image is complete. This is another workaround from
AMDVLK.

This fixes corrupted textures in Halo: The Master Chief Collection.

v2: Add RADV_CMD_FLAG_INV_L2 | RADV_CMD_FLAG_INV_VCACHE to flush_bits
    just in case (Samuel Pitoiset)

Closes: #3347

Signed-off-by: John Brooks <john@fastquake.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17970>
2022-08-23 19:01:17 +00:00
Friedrich Vock
50238f4958 amd/common: Remove redundant code for determining memory ops per clock
Fixes: 82fd379d9e ("amd/common: move ac_memory_ops_per_clock into ac_gpu_info.h")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18038>
2022-08-16 19:06:21 +00:00
Friedrich Vock
82fd379d9e amd/common: move ac_memory_ops_per_clock into ac_gpu_info.h
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17973>
2022-08-10 10:58:54 +00:00
Timur Kristóf
dccd6f495a ac/nir/cull: Fix typo in bounding box culling.
Bounding box culling is only viable when the W of every
vertex is positive. Always accept triangles in which any
W is negative.
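
The rule above can be sketched as a small predicate. This is a minimal illustration, not the NIR code from the commit: the function name and the `outside_viewport` input (standing in for the result of the actual bounding box test) are hypothetical.

```python
def bbox_cull_rejects(w_values, outside_viewport):
    """Return True only if the triangle may be rejected by bounding box culling.

    w_values: the clip-space W of each vertex.
    outside_viewport: hypothetical result of the bounding box test itself.
    """
    # The bounding box test is only meaningful when every W is positive,
    # so any non-positive W forces the triangle to be accepted.
    all_w_positive = all(w > 0.0 for w in w_values)
    return all_w_positive and outside_viewport
```

A triangle with W values (1.0, 2.0, -0.5) is therefore never rejected by this test, whatever its bounding box looks like.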

Fixes: 0d527bb1aa
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7018
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17929>
2022-08-08 11:16:04 +00:00
Timur Kristóf
e035c289b5 ac/nir/cull: Tweak phi for cull_small_primitive branch.
cull_small_primitive will now allow the caller to pass an
SSA def that it will use to determine if the primitive was
initially rejected.

This allows ACO to remove an s_branch instruction from every
NGG culling shader.

Fossil DB stats on Navi 21:

Totals from 60918 (45.16% of 134906) affected shaders:
CodeSize: 160086644 -> 159355824 (-0.46%); split: -0.46%, +0.00%
Instrs: 30477916 -> 30356092 (-0.40%); split: -0.40%, +0.00%
Latency: 139587915 -> 139611487 (+0.02%); split: -0.00%, +0.02%
InvThroughput: 21184261 -> 21184346 (+0.00%)
Copies: 2762930 -> 2702024 (-2.20%); split: -2.20%, +0.00%
Branches: 1236970 -> 1176052 (-4.92%)
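
The percentages in these stats can be reproduced with plain arithmetic. The helper names below are made up for this sketch and are not part of any Fossil DB tooling.

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


def fmt(value):
    """Format a percentage with an explicit sign and two decimals."""
    return f"{value:+.2f}%"


code_size = fmt(percent_change(160086644, 159355824))  # CodeSize row
branches = fmt(percent_change(1236970, 1176052))       # Branches row
affected_share = f"{60918 / 134906 * 100.0:.2f}%"      # affected shaders
removed_branches = 1236970 - 1176052
```

The number of removed branches works out to 60918, the same as the number of affected shaders, which is consistent with one s_branch being removed from each NGG culling shader.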

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17919>
2022-08-06 20:43:36 +00:00
Timur Kristóf
e4b0caae61 ac/nir/cull: Make cull functions more consistent.
Now they all return whether the primitive was rejected.

No Fossil DB changes.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17870>
2022-08-05 22:10:28 +00:00
Timur Kristóf
c721f751f2 ac/nir/ngg: Move LDS store of accepted flag into the inner branch.
For primitives which are rejected based on only W and face, this
will reduce the number of executed branches.

Fossil DB stats on Navi 21:

Totals from 60918 (45.16% of 134906) affected shaders:
CodeSize: 160330564 -> 160086644 (-0.15%)
Instrs: 30477385 -> 30477916 (+0.00%); split: -0.00%, +0.00%
Latency: 139802763 -> 139587915 (-0.15%); split: -0.15%, +0.00%
InvThroughput: 21198444 -> 21184261 (-0.07%); split: -0.07%, +0.00%
SClause: 749811 -> 749810 (-0.00%)
Copies: 2701482 -> 2762930 (+2.27%); split: -0.00%, +2.28%

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17870>
2022-08-05 22:10:28 +00:00
Timur Kristóf
0d527bb1aa ac/nir/cull: Change if condition for bounding box culling.
The previous code checked all_w_positive in the if condition.
Instead, always execute the bbox culling code and include
all_w_positive at the end.

We assume checking in the if is not beneficial because it's
very unlikely that there is no primitive in a wave whose W are
not all positive.

This allows moving other things to the condition
in the next commit.

Fossil DB stats on Navi 21:

Totals from 60918 (45.16% of 134906) affected shaders:
CodeSize: 160574204 -> 160330564 (-0.15%); split: -0.15%, +0.00%
Instrs: 30538297 -> 30477385 (-0.20%); split: -0.20%, +0.00%
Latency: 139810902 -> 139802763 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 21198449 -> 21198444 (-0.00%); split: -0.00%, +0.00%
SClause: 749810 -> 749811 (+0.00%)
Copies: 2701474 -> 2701482 (+0.00%); split: -0.00%, +0.00%

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17870>
2022-08-05 22:10:28 +00:00
Timur Kristóf
fb4e68b724 ac/nir/cull: Move the contents of cull_bbox into ac_nir_cull_triangle.
No Fossil DB changes.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17870>
2022-08-05 22:10:28 +00:00
Timur Kristóf
e2ca24063a ac/nir/cull: Move some code from cull_bbox into helper functions.
No Fossil DB changes.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17870>
2022-08-05 22:10:27 +00:00
Marek Olšák
0c1801706e ac/llvm: handle external textures in ac_nir_lower_resinfo
Fixes: 4f622d62d0 - ac/nir: add ac_nir_lower_resinfo
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6993

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17902>
2022-08-05 09:04:17 +00:00
Marek Olšák
4f622d62d0 ac/nir: add ac_nir_lower_resinfo
Emulating image_get_resinfo should be faster than using the hw.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17693>
2022-08-03 17:44:15 +00:00
Marek Olšák
f129db911b radeonsi/gfx11: use a better workaround for the export conflict bug
This is recommended for better performance.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17864>
2022-08-03 00:57:16 +00:00
Marek Olšák
2ed9eb1b63 radeonsi/gfx11: enable shader prefetch except for initial chip revisions
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17864>
2022-08-03 00:57:16 +00:00
Marek Olšák
9e9cc62912 radeonsi: follow shader_info.float_controls_execution_mode (mostly)
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17864>
2022-08-03 00:57:16 +00:00
Marek Olšák
5c0b0f0058 ac/surface: don't forbid 256KB swizzle modes on smaller gfx11 chips
let addrlib make the right choice

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17864>
2022-08-03 00:57:16 +00:00
Marek Olšák
bc85e79bba ac/gpu_info: require amdgpu DRM 3.15.0 (kernel 4.12) from July 2017
to match the radeon requirement

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>
2022-07-27 05:01:38 +00:00
Marek Olšák
9f6a64b1c3 gallium/radeon: require radeon DRM 2.50.0 (kernel 4.12) from July 2017
This is the latest radeon DRM version.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>
2022-07-27 05:01:38 +00:00
Marek Olšák
8426cf9132 ac/gpu_info: remove unused has_unaligned_shader_loads
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>
2022-07-27 05:01:38 +00:00
Marek Olšák
f3f00f77ad ac/gpu_info: remove amdgpu_gpu_info parameter from ac_query_gpu_info
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>
2022-07-27 05:01:38 +00:00
Marek Olšák
41888505fc ac/gpu_info: use drm_amdgpu_device_info instead of amdgpu_gpu_info
These fields are identical but the latter is from libdrm.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>
2022-07-27 05:01:38 +00:00
Marek Olšák
abd188ec1c radeonsi: remove workarounds for radeon DRM < 2.45.0
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>
2022-07-27 05:01:38 +00:00
Marek Olšák
3657cdafd6 amd: require amdgpu DRM 3.2.0 from April 2016
This removes an early bug workaround.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>
2022-07-27 05:01:38 +00:00
Marek Olšák
ff19666a0d ac/gpu_info: remove redundant vcn_encode
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>
2022-07-27 05:01:38 +00:00
Marek Olšák
89113c0338 ac/gpu_info: remove redundant vce_encode
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>
2022-07-27 05:01:38 +00:00
Marek Olšák
9cbbdc6583 ac/gpu_info: remove redundant uvd_encode
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>
2022-07-27 05:01:38 +00:00
Marek Olšák
2972ceccfd ac/gpu_info: remove redundant jpeg_decode
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>
2022-07-27 05:01:38 +00:00
Marek Olšák
a0d2e16c91 ac/gpu_info: remove redundant uvd_decode
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>
2022-07-27 05:01:38 +00:00
Marek Olšák
12c5d64fae ac/gpu_info: remove vram_size and gtt_size in favor of *_kb variants
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>
2022-07-27 05:01:38 +00:00
Marek Olšák
983223de5d ac/gpu_info: use the kernel-reported GFX IP version to set gfx_level
hopefully this won't break the world.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>
2022-07-27 05:01:38 +00:00
Marek Olšák
6504d7172c ac/gpu_info: use hw_ip::ip_discovery_version to set IP versions
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>
2022-07-27 05:01:38 +00:00
Marek Olšák
9552da66cc ac/gpu_info: rework and extend device info to be more informative
This is the result with AMD_DEBUG=info:

Device info:
    name = NAVI23
    marketing_name = AMD Radeon RX 6600
    num_se = 2
    num_rb = 8
    num_cu = 28
    max_gpu_freq = 2750 MHz
    max_gflops = 9856 GFLOPS
    l0_cache_size = 16 KB
    l1_cache_size = 128 KB
    l2_cache_size = 2048 KB
    l3_cache_size = 32 MB
    memory_channels = 8 (TCC blocks)
    memory_size = 8 GB (8192 MB)
    memory_freq = 14 GHz
    memory_bus_width = 128 bits
    memory_bandwidth = 224 GB/s
    clock_crystal_freq = 100000 KHz
    IP GFX     10.3 	queues:1
    IP COMP    10.3 	queues:4
    IP SDMA     5.2 	queues:2
    IP VCN_DEC  3.0 	queues:1
    IP VCN_ENC  3.0 	queues:1
    IP VCN_JPG  3.0 	queues:1

It might not be 100% correct with other chips.
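
Two of the derived values in the listing can be cross-checked from the other fields. The sketch below assumes 64 FP32 lanes per CU and 2 floating-point operations per lane per clock (one fused multiply-add). Those constants and the function names are assumptions for illustration, since the commit message does not show how ac_gpu_info computes these fields.

```python
def peak_gflops(num_cu, max_gpu_freq_mhz, lanes_per_cu=64, flops_per_lane=2):
    """Peak FP32 throughput in GFLOPS under the assumed per-CU constants."""
    return num_cu * lanes_per_cu * flops_per_lane * max_gpu_freq_mhz // 1000


def memory_bandwidth_gb_s(effective_freq_mhz, bus_width_bits):
    """Memory bandwidth in GB/s from the effective data rate and bus width."""
    return effective_freq_mhz * bus_width_bits // 8 // 1000


gflops = peak_gflops(num_cu=28, max_gpu_freq_mhz=2750)
bandwidth = memory_bandwidth_gb_s(effective_freq_mhz=14000, bus_width_bits=128)
```

With the NAVI23 values above, this gives 9856 GFLOPS and 224 GB/s, matching max_gflops and memory_bandwidth in the listing.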

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>
2022-07-27 05:01:38 +00:00
Marek Olšák
dd6b001775 ac/gpu_info: remove tabs
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>
2022-07-27 05:01:38 +00:00
Marek Olšák
f218c3d795 ac/gpu_info: rename info fields to num_cu, memory_bus_width, memory_freq_mhz
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411>
2022-07-27 05:01:38 +00:00
Timur Kristóf
8d7ca7783b ac/nir/ngg: Remember proper bit sizes of GS output variables.
The LLVM backend keeps track of 16-bit output variables and it will
miscompile shaders when these outputs aren't the correct bitsize.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17706>
2022-07-22 08:17:39 +00:00
Timur Kristóf
e60fbb4dc9 ac/nir/ngg: Copy comment about LDS layout for NGG GS.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17706>
2022-07-22 08:17:39 +00:00
Timur Kristóf
dd781c1ccb ac/nir/ngg: Create output variable for primitive ID export.
This makes the RADV/LLVM backend happy and mitigates a crash.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17581>
2022-07-21 21:53:29 +00:00
Timur Kristóf
b0a7db1d3b ac/nir/ngg: Move primitive ID workgroup barrier to proper place.
Previously, it was in a divergent branch, therefore
it could hang the GPU when a workgroup had a primitive-only wave.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17581>
2022-07-21 21:53:29 +00:00
Qiang Yu
754e43369d ac/nir/ngg: Decouple primitive ID store and primitive export.
There's no dependency between them.

This can simplify the compiler backend translation by
always storing prim id before vertex export, which also
benefits the LLVM backend in later changes.

Signed-off-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17581>
2022-07-21 21:53:29 +00:00
Timur Kristóf
822e370390 radv: Allow reusing pipeline compute state emit functions.
We are going to reuse them outside of radv_pipeline.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16531>
2022-07-20 19:00:30 +00:00
Qiang Yu
eeaf0b1888 ac/nir/ngg: add a barrier before prim id export
When culling is enabled, it uses LDS space, which overlaps with
the prim id export.

Fixes: e97f0463a8 ("ac/nir: Implement NGG deferred attribute culling in NIR.")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17593>
2022-07-18 09:50:09 +00:00
Qiang Yu
0b7ef846b3 ac/nir/ngg: fix nogs culling scratch size
Should be in bytes not dwords.

Fixes: e97f0463a8 ("ac/nir: Implement NGG deferred attribute culling in NIR.")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17593>
2022-07-18 09:50:09 +00:00
Marek Olšák
9a39da359e ac/surface: expose all 64K_R_X and 256K_R_X modifiers on gfx11
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17410>
2022-07-09 21:00:51 +00:00
Marek Olšák
3514b73244 amd: update addrlib - trivial changes
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17410>
2022-07-09 21:00:51 +00:00
Bas Nieuwenhuizen
10211913e1 radv: Add perf counter lock/unlock commandbuffers.
These set the pass and make sure we don't have multiple submissions
touching the perf counters/pass at the same time.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16879>
2022-07-09 12:29:06 +00:00
Bas Nieuwenhuizen
6cfc2e91e8 radv: Add performance counter reg write.
Needed for reliably writing performance counter selectors.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16879>
2022-07-09 12:29:05 +00:00
Timur Kristóf
8bfeb467bf ac/nir/ngg: Ignore driver location for mesh shader outputs.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17244>
2022-07-01 18:09:07 +00:00
Timur Kristóf
2ac3e921e3 ac/nir/ngg: Refactor LDS instructions in NGG GS vertex emit and export.
Change NGG GS emit vertex code to emit combined shared stores,
also change the export vertex code to emit combined shared loads.
This results in more optimal code generation, i.e. fewer LDS
instructions are generated.

GS vertices are stored using an odd stride to minimize the chance
of bank conflicts, which means that unfortunately
we still can't use an alignment higher than 4 here,
so the best we can get are some ds_read2_b32 instructions.
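
The effect of an odd stride on bank conflicts can be illustrated with a toy model. The sketch assumes 32 dword-wide LDS banks and one access per thread at the start of its vertex. The bank count, the strides, and the function name are illustrative assumptions rather than values from the commit.

```python
NUM_BANKS = 32  # assumed number of dword-wide LDS banks


def banks_touched(stride_dwords, num_threads=32):
    """Set of banks hit when thread i accesses dword i * stride_dwords."""
    return {(tid * stride_dwords) % NUM_BANKS for tid in range(num_threads)}


even_stride_banks = len(banks_touched(8))  # threads pile onto a few banks
odd_stride_banks = len(banks_touched(9))   # accesses spread over every bank
```

With a stride of 8 dwords the 32 threads share only 4 banks, while a stride of 9 dwords spreads them over all 32. The cost is that a 9-dword (36-byte) stride leaves vertex addresses aligned to only 4 bytes, in line with the alignment limit mentioned above.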

Fossil DB stats on Navi 21 (formerly Sienna Cichlid):

Totals from 135 (0.10% of 128653) affected shaders:
VGPRs: 6416 -> 6512 (+1.50%)
CodeSize: 529436 -> 503792 (-4.84%)
MaxWaves: 2952 -> 2924 (-0.95%)
Instrs: 93384 -> 90176 (-3.44%)
Latency: 290283 -> 293611 (+1.15%); split: -0.36%, +1.50%
InvThroughput: 81218 -> 82598 (+1.70%)
Copies: 6603 -> 6606 (+0.05%)
PreVGPRs: 5037 -> 5076 (+0.77%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11425>
2022-06-30 18:15:50 +02:00