Samuel Pitoiset
9a6aa3e23a
aco: prevent a division by zero when patch control points is dynamic
...
tess_input_vertices is zero if the state is dynamic.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18344 >
2022-09-13 08:24:14 +00:00
Samuel Pitoiset
8fcb4aa0eb
radv: compact MRTs to save PS export memory space
...
If there are holes between color outputs (e.g. a shader exports MRT1,
but not MRT0), we can remove the holes by moving higher MRTs lower. The
hardware will remap the MRTs to their correct locations if we remove
holes in SPI_SHADER_COL_FORMAT but not CB_SHADER_MASK. This is good for
performance because the hardware will allocate less space for color
MRTs.
This also allows to remove even more unused color exports because we no
longer need to force previous targets to be non-zero. Only SotTR seems
affected from our fossils db.
fossils-db (NAVI21):
Totals from 859 (0.64% of 134913) affected shaders:
VGPRs: 24328 -> 24216 (-0.46%)
CodeSize: 1433276 -> 1422576 (-0.75%)
Instrs: 255275 -> 253728 (-0.61%)
Latency: 1666836 -> 1661544 (-0.32%)
InvThroughput: 346038 -> 343406 (-0.76%)
Copies: 16520 -> 16506 (-0.08%)
PreSGPRs: 25934 -> 25920 (-0.05%)
PreVGPRs: 19903 -> 19662 (-1.21%)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5786 >
2022-09-07 08:17:20 +00:00
Georg Lehmann
a82b9d6001
aco: Fix image instructions with lod when 2d_view_of_3d is enabled on GFX9.
...
If there's a lod parameter it matter if the image is 3d or 2d because
the hw reads either the fourth or third component as lod. So detect
3d images and place the lod at the third component otherwise.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18114 >
2022-08-31 17:44:38 +00:00
Rhys Perry
96df4499ac
radv,aco: implement 64-bit vertex inputs
...
Note that, from 22.4.1. Vertex Input Extraction of Vulkan spec:
The input variable in the shader must be declared as a 64-bit data type if
and only if format is a 64-bit data type.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17894 >
2022-08-30 19:02:11 +00:00
Rhys Perry
831257bdce
radv,aco: use pipe_format for dynamic vertex input state
...
Also prepare for 64-bit and R8G8B8/R16G16B16 with the addition of
radv_vs_input_state::nontrivial_formats.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5021
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17894 >
2022-08-30 19:02:11 +00:00
Rhys Perry
c06a5a5ebd
radv,aco: use pipe_format for static vertex input state
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17894 >
2022-08-30 19:02:11 +00:00
Rhys Perry
6a2ada93b4
ac: add ac_vtx_format_info
...
This will be used by RADV and ACO.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17894 >
2022-08-30 19:02:11 +00:00
Daniel Schürmann
3d6ea4f666
aco: use std::vector::reserve() more often
...
This removes the majority of vector re-allocations.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18105 >
2022-08-30 16:03:26 +00:00
Rhys Perry
aa4db00c57
aco: remove dead code for querying image size/samples/levels
...
ac_nir_lower_resinfo() now lowers these.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17991 >
2022-08-30 07:37:08 +00:00
Rhys Perry
290df95870
aco: add SCC clobber in build_cube_select
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17991 >
2022-08-30 07:37:08 +00:00
Rhys Perry
636450c274
aco: allow direct_fetch=true for vec4 VS input loads
...
This seems to be a (mostly harmless) mistake from 369b8cffea .
fossil-db (navi21):
Totals from 15 (0.01% of 135636) affected shaders:
Instrs: 1992 -> 1999 (+0.35%)
Latency: 13557 -> 13567 (+0.07%); split: -0.24%, +0.31%
InvThroughput: 4059 -> 4065 (+0.15%); split: -0.20%, +0.34%
Copies: 186 -> 193 (+3.76%)
fossil-db (polaris10):
Totals from 5 (0.00% of 135610) affected shaders:
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18225 >
2022-08-26 15:28:55 +00:00
Rhys Perry
030d6f873e
aco: don't expand vec3 VS input load to vec4 on GFX6
...
Removes the (small) possibility of invalid memory access.
fossil-db (pitcairn):
Totals from 35456 (26.15% of 135610) affected shaders:
MaxWaves: 259508 -> 260642 (+0.44%); split: +0.44%, -0.01%
Instrs: 7915383 -> 7965774 (+0.64%); split: -0.09%, +0.72%
CodeSize: 37163748 -> 37524804 (+0.97%); split: -0.04%, +1.01%
SGPRs: 1515128 -> 1513576 (-0.10%); split: -0.27%, +0.17%
VGPRs: 1218376 -> 1211160 (-0.59%); split: -0.71%, +0.12%
SpillSGPRs: 1152 -> 1144 (-0.69%)
Latency: 83777626 -> 83867137 (+0.11%); split: -0.61%, +0.72%
InvThroughput: 25722445 -> 25727745 (+0.02%); split: -0.23%, +0.25%
VClause: 232058 -> 230464 (-0.69%); split: -2.53%, +1.84%
SClause: 322579 -> 322108 (-0.15%); split: -0.76%, +0.61%
Copies: 547032 -> 547954 (+0.17%); split: -1.83%, +2.00%
Branches: 72538 -> 72542 (+0.01%)
PreVGPRs: 898453 -> 897584 (-0.10%); split: -0.13%, +0.03%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18225 >
2022-08-26 15:28:55 +00:00
Rhys Perry
3260844448
aco: fix 16-bit VS inputs
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 3fba5bb9cc ("aco: implement 16-bit vertex fetches with tbuffer_load_format_d16_*")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18225 >
2022-08-26 15:28:55 +00:00
Samuel Pitoiset
ee5b9bcc57
radv: stop duplicating radv_vs_output_info
...
Only the last vertex stage needs to access this.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18210 >
2022-08-26 14:07:09 +00:00
Samuel Pitoiset
8cd1683944
aco: fix wrong size for 1D images and A16 on GFX9
...
Size is in bytes, not bits.
Fixes plenty of crashes in CI, like
dEQP-VK.synchronization.op.single_queue.event.write_image_fragment_read_image_tess_eval.image_128_r32_uint.
Fixes: 46f6e2ddbb ("aco: Implement storage image A16.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18266 >
2022-08-26 13:30:46 +00:00
Georg Lehmann
9151048957
aco: Combine 16bit undef and constants instead of using s_pack.
...
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18106 >
2022-08-24 17:04:03 +00:00
Georg Lehmann
46f6e2ddbb
aco: Implement storage image A16.
...
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18106 >
2022-08-24 17:04:03 +00:00
Georg Lehmann
8eac45b274
nir: Add nir_ssa_scalar_is_undef.
...
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18183 >
2022-08-24 15:22:40 +00:00
Georg Lehmann
2dd641119f
aco: Force tex operand to have the correct sub dword size before packing.
...
get_ssa_temp's and NIR's bit size can differ for scalar sources.
This causes broken packing of the MIMG operands with A16/G16.
Fixes: f5f73db846 ("aco: Support 16bit sources for texture ops.")
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18008 >
2022-08-18 14:42:28 +02:00
Timur Kristóf
5ead973824
aco: Add faster code path to store_lds for consecutive write mask.
...
This makes it more likely to hit the fast path for count == 1
in the split_store_data function.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17923 >
2022-08-14 15:09:07 +00:00
Georg Lehmann
f5f73db846
aco: Support 16bit sources for texture ops.
...
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16979 >
2022-07-26 16:54:08 +00:00
Timur Kristóf
22796d91ea
aco: Remove hack for primitive ID export.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17581 >
2022-07-21 21:53:29 +00:00
Georg Lehmann
333f056edf
radv, aco: Don't lower 16bit isign.
...
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17440 >
2022-07-20 14:31:15 +00:00
Georg Lehmann
b96126ee95
radv,aco: Don't lower and vectorize 16bit iabs.
...
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17440 >
2022-07-20 14:31:15 +00:00
Daniel Schürmann
f12eb5c213
aco: avoid unnecessary copies in emit_wqm()
...
No fossil-db changes.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15347 >
2022-07-19 16:30:49 +00:00
Samuel Pitoiset
270cc39648
aco: add support for compiling PS epilogs
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17485 >
2022-07-18 18:40:02 +00:00
Samuel Pitoiset
8d13392969
aco: refactor export_fs_mrt_color() for PS epilogs preparation
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17485 >
2022-07-18 18:40:02 +00:00
Samuel Pitoiset
a6dff6caa1
aco: emit p_jump_to_epilog if the main fragment shader has an epilog
...
MRTZ is still exported from the main shader.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17485 >
2022-07-18 18:40:02 +00:00
Konstantin Seurer
82283de717
radv: Use a global address for sbt_base
...
Required for indirect(2) ray tracing to work.
Fixes the following tests:
dEQP-VK.ray_tracing_pipeline.trace_rays_indirect2.indirect_*
dEQP-VK.ray_tracing_pipeline.trace_rays_cmds_maintenance_1.indirect2_*
Fixes: 16585664 ("radv: vkCmdTraceRaysIndirect2KHR")
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17316 >
2022-07-14 16:35:31 +00:00
Konstantin Seurer
69daa3f762
radv: Use a global address for ray_launch_size
...
Required for indirect(2) ray tracing to work.
Fixes: b30f96dd ("radv,aco: Use ray_launch_size_addr")
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17316 >
2022-07-14 16:35:31 +00:00
Rhys Perry
0e783d687a
aco: use scratch_* for scratch load/store on GFX9+
...
fossil-db (navi21):
Totals from 52 (0.03% of 162293) affected shaders:
Instrs: 83190 -> 82145 (-1.26%)
CodeSize: 454892 -> 447260 (-1.68%); split: -1.68%, +0.00%
VGPRs: 4768 -> 4672 (-2.01%)
Latency: 1490887 -> 1487170 (-0.25%); split: -0.68%, +0.43%
InvThroughput: 935500 -> 933060 (-0.26%); split: -0.72%, +0.46%
VClause: 2715 -> 2632 (-3.06%); split: -4.53%, +1.47%
SClause: 1902 -> 1883 (-1.00%)
Copies: 8839 -> 8496 (-3.88%)
PreSGPRs: 2012 -> 1807 (-10.19%)
PreVGPRs: 3282 -> 3192 (-2.74%)
fossil-db (vega10):
Totals from 41 (0.03% of 161355) affected shaders:
Instrs: 35772 -> 35699 (-0.20%)
CodeSize: 187040 -> 186584 (-0.24%)
VGPRs: 4044 -> 4072 (+0.69%)
Latency: 243088 -> 242379 (-0.29%)
InvThroughput: 180301 -> 179783 (-0.29%)
VClause: 1204 -> 1216 (+1.00%)
SClause: 653 -> 637 (-2.45%)
Copies: 3736 -> 3704 (-0.86%); split: -0.88%, +0.03%
PreSGPRs: 1331 -> 1207 (-9.32%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079 >
2022-07-08 14:49:03 +00:00
Rhys Perry
d2d94b62f2
aco: initialize scratch base registers on GFX9-GFX10.3
...
fossil-db (navi21):
Totals from 1142 (0.70% of 162293) affected shaders:
Instrs: 271636 -> 271974 (+0.12%)
CodeSize: 1532020 -> 1533792 (+0.12%)
Latency: 7484066 -> 7485698 (+0.02%)
InvThroughput: 4048824 -> 4049579 (+0.02%)
SClause: 4171 -> 4212 (+0.98%)
PreSGPRs: 11203 -> 12276 (+9.58%)
fossil-db (vega10):
Totals from 3327 (2.06% of 161355) affected shaders:
Instrs: 257413 -> 257601 (+0.07%)
CodeSize: 1424244 -> 1425372 (+0.08%)
Latency: 8598402 -> 8600466 (+0.02%)
InvThroughput: 7906335 -> 7908234 (+0.02%)
SClause: 4932 -> 4973 (+0.83%)
PreSGPRs: 22010 -> 25405 (+15.42%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079 >
2022-07-08 14:49:03 +00:00
Rhys Perry
cbeb25ce91
aco: make FLAT_instruction::offset signed
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079 >
2022-07-08 14:49:03 +00:00
Samuel Pitoiset
cf46397aec
aco: fix load_barycentric_at_sample without MSAA
...
It's legal to use this instruction in a fragment shader, even if the
graphics pipeline doesn't use MSAA.
Fixes
dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.non_multisample_buffer.sample_n_*.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17398 >
2022-07-08 07:28:33 +00:00
Rhys Perry
48578713b7
radv,aco,ac/llvm: use nir_op_f{sin,cos}_amd
...
This lets NIR optimize the multiplication, particularly sin/cos(a * #b).
fossil-db (Sienna Cichlid):
Totals from 12306 (7.58% of 162293) affected shaders:
MaxWaves: 224814 -> 224834 (+0.01%)
Instrs: 17365273 -> 17338758 (-0.15%); split: -0.16%, +0.00%
CodeSize: 93478488 -> 93354912 (-0.13%); split: -0.14%, +0.01%
VGPRs: 752080 -> 752072 (-0.00%); split: -0.00%, +0.00%
SpillSGPRs: 8440 -> 8410 (-0.36%)
Latency: 200402154 -> 200279405 (-0.06%); split: -0.06%, +0.00%
InvThroughput: 37588077 -> 37545545 (-0.11%); split: -0.11%, +0.00%
VClause: 293863 -> 293874 (+0.00%); split: -0.03%, +0.03%
SClause: 619539 -> 619064 (-0.08%); split: -0.09%, +0.01%
Copies: 1151591 -> 1151641 (+0.00%); split: -0.04%, +0.05%
Branches: 506434 -> 506437 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 877609 -> 877517 (-0.01%); split: -0.01%, +0.00%
PreVGPRs: 711938 -> 711940 (+0.00%); split: -0.00%, +0.00%
fossil-db (LLVM, Sienna Cichlid):
Totals from 4377 (3.59% of 121873) affected shaders:
SGPRs: 358960 -> 359176 (+0.06%); split: -0.18%, +0.25%
VGPRs: 319832 -> 319720 (-0.04%); split: -0.18%, +0.15%
SpillSGPRs: 46983 -> 47007 (+0.05%); split: -0.99%, +1.04%
CodeSize: 30872812 -> 30764512 (-0.35%); split: -0.39%, +0.04%
MaxWaves: 73814 -> 73904 (+0.12%); split: +0.25%, -0.13%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10587 >
2022-07-07 22:18:08 +00:00
Daniel Schürmann
2e895f8b04
radv: vectorize nir_op_fabs
...
Totals from 4 (0.00% of 134913) affected shaders: (GFX10.3)
CodeSize: 37868 -> 36576 (-3.41%)
Instrs: 5332 -> 5169 (-3.06%)
Latency: 24452 -> 24174 (-1.14%)
InvThroughput: 9784 -> 9462 (-3.29%)
VClause: 54 -> 50 (-7.41%)
Copies: 520 -> 519 (-0.19%)
PreVGPRs: 266 -> 264 (-0.75%)
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15176 >
2022-06-27 15:07:27 +00:00
Rhys Perry
6fc2622abd
aco: don't skip VS->TCS barrier if TCS output vertices doesn't match input
...
TCS invocations correspond to output patch vertices, not input. If they
differ, TCS invocations can be in a different subgroup than VS invocations
of the input patch.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6564
Fixes: 152092b8ea ("aco: skip s_barrier if TCS patches are within subgroup")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17174 >
2022-06-23 10:08:02 +00:00
Georg Lehmann
5ebb014bdf
radv, aco: Round texture array layer in NIR.
...
Foz-DB Navi21:
Totals from 9100 (6.75% of 134913) affected shaders:
VGPRs: 609912 -> 610104 (+0.03%); split: -0.01%, +0.05%
SpillSGPRs: 1459 -> 1489 (+2.06%)
CodeSize: 66705920 -> 66620288 (-0.13%); split: -0.13%, +0.00%
MaxWaves: 148546 -> 148518 (-0.02%); split: +0.00%, -0.02%
Instrs: 12278485 -> 12255821 (-0.18%); split: -0.19%, +0.00%
Latency: 277414916 -> 277261192 (-0.06%); split: -0.08%, +0.02%
InvThroughput: 48431180 -> 48394637 (-0.08%); split: -0.11%, +0.03%
VClause: 250866 -> 251062 (+0.08%); split: -0.04%, +0.11%
SClause: 498377 -> 498173 (-0.04%); split: -0.08%, +0.04%
Copies: 652835 -> 655371 (+0.39%); split: -0.09%, +0.48%
Branches: 284367 -> 284371 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 498580 -> 498477 (-0.02%)
PreVGPRs: 558436 -> 558709 (+0.05%); split: -0.01%, +0.06%
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16871 >
2022-06-08 20:57:22 +00:00
Konstantin Seurer
16585664cd
radv: vkCmdTraceRaysIndirect2KHR
...
This changes the trace rays logic to always use
VkTraceRaysIndirectCommand2KHR and implements
vkCmdTraceRaysIndirect2KHR. I renamed the
load_sbt_amd to sbt_base_amd and moved the SBT
load lowering from ACO to NIR.
Note that we can not just upload one pointer to
all the trace parameters because that would
be incompatible with traceRaysIndirect.
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16430 >
2022-06-08 20:20:21 +00:00
Georg Lehmann
d8493e5310
radv, aco: Lower txf offset in NIR.
...
Foz-DB Navi21:
Totals from 384 (0.28% of 134913) affected shaders:
VGPRs: 29736 -> 29536 (-0.67%)
CodeSize: 2455796 -> 2452652 (-0.13%); split: -0.13%, +0.01%
MaxWaves: 6350 -> 6358 (+0.13%)
Instrs: 457743 -> 456273 (-0.32%); split: -0.33%, +0.01%
Latency: 6680266 -> 6730612 (+0.75%); split: -0.03%, +0.78%
InvThroughput: 1562936 -> 1599375 (+2.33%); split: -0.05%, +2.38%
VClause: 9258 -> 9291 (+0.36%); split: -0.14%, +0.50%
SClause: 15713 -> 15707 (-0.04%); split: -0.08%, +0.04%
Copies: 26878 -> 27021 (+0.53%); split: -0.03%, +0.56%
PreVGPRs: 27259 -> 27230 (-0.11%); split: -0.11%, +0.01%
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16869 >
2022-06-08 08:13:01 +00:00
Rhys Perry
f4c02d9116
aco: fix SMEM load_global with VGPR address and non-zero offset
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: 3e9517c757 ("aco: implement _amd global access intrinsics")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16775 >
2022-06-06 17:47:59 +00:00
Rhys Perry
4d9f3fcf9c
aco: fix SMEM load_global_amd with non-zero offset
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: 3e9517c757 ("aco: implement _amd global access intrinsics")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16775 >
2022-06-06 17:47:59 +00:00
Georg Lehmann
1d815548ab
radv, aco: Packed usub_sat/isub_sat.
...
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13895 >
2022-06-01 17:09:25 +00:00
Georg Lehmann
d404f1964c
aco: Implement isub_sat.
...
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13895 >
2022-06-01 17:09:25 +00:00
Georg Lehmann
faa2a89487
aco: Implement usub_sat.
...
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13895 >
2022-06-01 17:09:25 +00:00
Georg Lehmann
529ec3d7dc
aco: Implement uclz.
...
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13895 >
2022-06-01 17:09:25 +00:00
Rhys Perry
dae1629778
aco: disable sdwa on gfx11
...
Instead of SDWA v_mov_b32/v_xor_b32, we can use a combination of
v_add_u16/v_sub_u16 (add/sub swap, similar to xor swap) and v_perm_b32
with a literal.
I don't know yet if GFX11 adds any new instructions which makes this
easier, but this approach should have full functionality.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16595 >
2022-05-31 18:07:34 +00:00
Rhys Perry
8384189b6c
aco: use p_parallelcopy for uniform reduction with zero source
...
I think v_mov_b32 was only used because a sub-dword p_parallelcopy
couldn't take constants on some gfx levels. That shouldn't be the case
anymore.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16595 >
2022-05-31 18:07:34 +00:00
Timur Kristóf
7761b4d89e
aco: Fix scratch with task shaders.
...
Task shaders work like compute shaders, their scratch pointer
is currently located at the first two user SGPRs.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16692 >
2022-05-24 12:33:49 +00:00
Samuel Pitoiset
95d4e5435b
radv: export implicit primitive ID in NIR for legacy VS or TES
...
It's implicit for VS or TES, while it's required for GS or MS.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16404 >
2022-05-20 14:55:05 +00:00