Samuel Pitoiset
7f7ef10bea
radv: precompute fragment shader register values
...
To make emission faster.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29022 >
2024-05-06 18:00:02 +00:00
Samuel Pitoiset
e5bc4d85bb
radv: precompute existing legacy GS register values later
...
To precompute all registers at the same place.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29022 >
2024-05-06 18:00:02 +00:00
Georg Lehmann
e7b942393a
aco/tests: simplify small constant copy test
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29045 >
2024-05-06 13:38:14 +00:00
Georg Lehmann
44cc0d31b8
aco/gfx10: use v_add_u16 with literal for constant copies
...
This also means the v_perm_b32 path is now unused.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29045 >
2024-05-06 13:38:14 +00:00
Georg Lehmann
7823065f64
aco/gfx11+: use v_cvt_pk_u8_f32 for 8bit constant copies
...
Foz-DB Navi31:
Totals from 201 (0.25% of 79395) affected shaders:
Instrs: 186869 -> 186857 (-0.01%)
CodeSize: 1026760 -> 1026700 (-0.01%); split: -0.01%, +0.00%
Latency: 2302050 -> 2301969 (-0.00%)
InvThroughput: 739466 -> 739431 (-0.00%)
Copies: 26467 -> 26454 (-0.05%)
VALU: 93529 -> 93516 (-0.01%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29045 >
2024-05-06 13:38:14 +00:00
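The percentages in these Foz-DB totals can be reproduced with simple relative-change arithmetic. The sketch below is an illustration only: the helper name `pct_change` is hypothetical and is not part of any fossil-db tooling, and the input numbers are copied from the statistics quoted in this log.

```python
def pct_change(before: int, after: int) -> str:
    """Relative change from before to after, formatted like the report lines."""
    return f"{(after - before) / before * 100:+.2f}%"


def affected_share(affected: int, total: int) -> str:
    """Share of affected shaders in the whole database, as a percentage."""
    return f"{affected / total * 100:.2f}%"


# Numbers taken from the commit messages in this log.
print("Instrs:", pct_change(186869, 186857))
print("Copies:", pct_change(26467, 26454))
print("Affected:", affected_share(201, 79395))
```

A small negative change that rounds to zero keeps its sign, which is why some lines read `(-0.00%)`.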
Samuel Pitoiset
92337aff03
radv: split cmdbuf dirty flags into dirty/dirty_dynamic
...
We are out of bits.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29039 >
2024-05-06 08:33:37 +02:00
Georg Lehmann
603982ea80
nir/opt_16bit_tex_image: optimize packed conversions too
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28730 >
2024-05-04 15:01:45 +00:00
Georg Lehmann
e63afdc681
radv: always run nir_opt_16bit_tex_image
...
The pass can optimize pack_half and constant sources even when
no 16bit instructions exist.
Foz-DB Navi21:
Totals from 3042 (3.83% of 79395) affected shaders:
MaxWaves: 69039 -> 69031 (-0.01%); split: +0.01%, -0.02%
Instrs: 2292054 -> 2291874 (-0.01%); split: -0.03%, +0.02%
CodeSize: 12567868 -> 12544888 (-0.18%); split: -0.23%, +0.05%
VGPRs: 145384 -> 145352 (-0.02%); split: -0.06%, +0.04%
SpillSGPRs: 451 -> 452 (+0.22%)
Latency: 23546543 -> 23536416 (-0.04%); split: -0.07%, +0.03%
InvThroughput: 5180446 -> 5164437 (-0.31%); split: -0.35%, +0.04%
VClause: 50537 -> 50535 (-0.00%); split: -0.05%, +0.04%
SClause: 84726 -> 84750 (+0.03%); split: -0.04%, +0.06%
Copies: 140384 -> 140421 (+0.03%); split: -0.34%, +0.37%
Branches: 40412 -> 40413 (+0.00%)
PreVGPRs: 120213 -> 120262 (+0.04%); split: -0.03%, +0.07%
VALU: 1607545 -> 1607593 (+0.00%); split: -0.03%, +0.03%
SALU: 215846 -> 215837 (-0.00%); split: -0.03%, +0.02%
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28730 >
2024-05-04 15:01:44 +00:00
Georg Lehmann
3a35522c8a
radv, radeonsi: don't use D16 for f2f16_rtz
...
D16 rounds towards zero for fp32 -> fp16, but for fixed point it rounds to
nearest even in fp16. MIMG without D16 also rounds to nearest even, but in fp32.
This means D16 and f2f16_rtz(tex@32) can produce different results.
Sadly this also means we can never use D16 unless fp16 rounding is undefined.
Cc: mesa-stable
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28730 >
2024-05-04 15:01:44 +00:00
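The rounding mismatch described in the commit above can be illustrated numerically. This is a minimal sketch that only models the two rounding orders named in the message; it does not model the hardware. The helper names and the sample texel value 254/255 are assumptions chosen for illustration. Python's `struct` format `e` converts to fp16 with round-to-nearest-even, and round-toward-zero is emulated by stepping one ulp back when the nearest-even result overshoots.

```python
import struct


def f16_bits_rne(x: float) -> int:
    """fp16 bit pattern of x, rounded to nearest even (struct's 'e' format)."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


def f16_value(bits: int) -> float:
    """Decode an fp16 bit pattern back to a Python float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def f16_bits_rtz(x: float) -> int:
    """fp16 bit pattern of x, rounded toward zero (finite, in-range x only)."""
    bits = f16_bits_rne(x)
    if abs(f16_value(bits)) > abs(x):
        bits -= 1  # nearest-even overshot in magnitude: step one ulp toward zero
    return bits


def to_f32(x: float) -> float:
    """Round a Python float to the nearest fp32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


texel = 254 / 255  # hypothetical fixed-point (unorm8) texel value

# Modeled D16 path for fixed point: a single round-to-nearest-even in fp16.
d16_like = f16_bits_rne(texel)
# Modeled f2f16_rtz(tex@32) path: nearest-even in fp32, then toward zero in fp16.
rtz_like = f16_bits_rtz(to_f32(texel))

print(hex(d16_like), hex(rtz_like))
```

For this input the two bit patterns differ by one ulp, which is the kind of mismatch that makes substituting D16 for f2f16_rtz unsafe.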
Georg Lehmann
4287358f59
ac/nir: explicitly use pack_half_2x16_rtz
...
rtz matters for constant folding.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28730 >
2024-05-04 15:01:44 +00:00
Daniel Schürmann
ce51e48cb6
radv: move nir_opt_dead_cf() before nir_opt_loop()
...
This can avoid unnecessary CF transformations.
Totals from 557 (0.70% of 79395) affected shaders: (GFX11)
MaxWaves: 12020 -> 12028 (+0.07%)
Instrs: 4237096 -> 4234110 (-0.07%); split: -0.08%, +0.01%
CodeSize: 21731952 -> 21719556 (-0.06%); split: -0.06%, +0.00%
VGPRs: 40492 -> 40480 (-0.03%)
SpillSGPRs: 467 -> 416 (-10.92%)
Latency: 25704891 -> 25684156 (-0.08%); split: -0.10%, +0.02%
InvThroughput: 5545224 -> 5542998 (-0.04%); split: -0.06%, +0.02%
VClause: 107850 -> 107838 (-0.01%); split: -0.02%, +0.01%
SClause: 90450 -> 90440 (-0.01%); split: -0.05%, +0.04%
Copies: 292714 -> 291354 (-0.46%); split: -0.50%, +0.03%
Branches: 133630 -> 133617 (-0.01%); split: -0.01%, +0.00%
PreSGPRs: 42299 -> 42104 (-0.46%); split: -0.48%, +0.02%
PreVGPRs: 36396 -> 36393 (-0.01%); split: -0.02%, +0.01%
VALU: 2321811 -> 2321192 (-0.03%); split: -0.03%, +0.01%
SALU: 505001 -> 503289 (-0.34%); split: -0.35%, +0.01%
SMEM: 132622 -> 132640 (+0.01%)
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28150 >
2024-05-03 13:01:29 +00:00
Daniel Schürmann
4453971fbb
radv: mark nir_opt_loop() as not idempotent
...
This pass misses opportunities because foreach_list_typed_safe()
might point to disconnected cf_nodes after some optimization got
applied. No fossil-db changes.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28150 >
2024-05-03 13:01:29 +00:00
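Why a non-idempotent pass has to be re-run can be shown with a toy example. The sketch below is not NIR code: `merge_once` is a hypothetical single-sweep pass over a list of integers that merges adjacent equal values, and like the pass described above it can miss opportunities that its own changes create. Running it until it reports no progress recovers them.

```python
def merge_once(values):
    """One sweep: merge adjacent equal values into their sum.

    The sweep moves past each merged value, so a new pair created by a
    merge is only seen by a later sweep.
    """
    out = []
    progress = False
    i = 0
    while i < len(values):
        if i + 1 < len(values) and values[i] == values[i + 1]:
            out.append(values[i] * 2)
            i += 2
            progress = True
        else:
            out.append(values[i])
            i += 1
    return out, progress


def run_to_fixed_point(values):
    """Repeat the pass until a sweep makes no progress; also count sweeps."""
    sweeps = 0
    progress = True
    while progress:
        values, progress = merge_once(values)
        sweeps += 1
    return values, sweeps


print(merge_once([1, 1, 2, 4]))
print(run_to_fixed_point([1, 1, 2, 4]))
```

A pass manager that assumed this pass was idempotent would stop after the first sweep and leave the remaining merges undone.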
Samuel Pitoiset
2e38cc06f8
radv/ci: document a recent regression on GFX6-8
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29037 >
2024-05-03 10:11:24 +00:00
Samuel Pitoiset
d71d189790
radv: add a new dirty state for emitting the color output state
...
SPI_SHADER_COL_FORMAT/CB_SHADER_MASK are used slightly differently
for PS epilogs, shader objects and monolithic graphics pipelines.
This introduces a new state that will allow us to emit these two
registers in only one place. The main motivation is for depth-only RB+
support and for tracking context registers in the cmdbuf.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28976 >
2024-05-03 06:29:05 +00:00
Samuel Pitoiset
66d4188ec5
radv: store cb_shader_mask for fragment shaders and epilogs
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28976 >
2024-05-03 06:29:05 +00:00
Samuel Pitoiset
0ce1bfc040
radv: rename col_format_non_compacted to spi_shader_col_format
...
This is always the non-compacted format because it's compacted right
before it's emitted. This looks much cleaner to me.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28976 >
2024-05-03 06:29:05 +00:00
Samuel Pitoiset
199f521804
radv: compact SPI_SHADER_COL_FORMAT as late as possible
...
This will allow us to do more cleanups because this thing is a complete
mess.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28976 >
2024-05-03 06:29:05 +00:00
Samuel Pitoiset
e1483d022b
radv: clear unwritten color attachments for monolithic PS earlier
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28976 >
2024-05-03 06:29:04 +00:00
Samuel Pitoiset
3b41fbd4b8
radv: precompute compute/task shader register values
...
To make emission faster.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29014 >
2024-05-03 06:07:46 +00:00
Georg Lehmann
d4084f7f09
aco/lower_to_hw: remove gfx6/7 subdword paths
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28836 >
2024-05-02 11:09:36 +00:00
Georg Lehmann
6ecbda83f8
aco/ra: remove gfx6/7 subdword paths
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28836 >
2024-05-02 11:09:35 +00:00
Georg Lehmann
d914ff3aa5
aco: add tests for lower_subdword
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28836 >
2024-05-02 11:09:35 +00:00
Georg Lehmann
47566d0df3
aco: add a subdword lowering pass
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28836 >
2024-05-02 11:09:35 +00:00
Georg Lehmann
6b35de971c
aco/lower_to_hw: don't use regClass to identify subdword reductions
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28836 >
2024-05-02 11:09:35 +00:00
Samuel Pitoiset
8c4d0b287f
radv: emit compute pipelines directly from the cmdbuf
...
Using this intermediate CS isn't really useful and it prevents us from
optimizing register writes in the near future. This will also be removed
for graphics pipelines.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28977 >
2024-05-02 10:39:03 +00:00
Timur Kristóf
72a73a6f8a
ac/nir/legacy: Use new pre-rasterization output info helper.
...
For legacy VS/TES and GS.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28936 >
2024-05-02 12:05:52 +02:00
Timur Kristóf
4ac0727f87
ac/nir/ngg: Use new pre-rasterization output info helper.
...
For NGG VS/TES and GS.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28936 >
2024-05-02 12:05:39 +02:00
Timur Kristóf
b1819d60ea
ac/nir: Add helper for pre-rasterization output info.
...
This is made to unify the handling of outputs in all
different pre-rasterization lowerings.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28936 >
2024-05-02 12:05:08 +02:00
Timur Kristóf
039e739eea
ac/nir: Move some helpers to new file.
...
Also remove nir_builder include from ac_nir.h.
This is done so that driver code doesn't need to be recompiled
when some internal parts of ac/nir in the new helper header
are changed.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28936 >
2024-05-02 12:04:53 +02:00
Timur Kristóf
cd66b77af0
aco: Add missing nir_builder include.
...
We would like to avoid including it in ac_nir.h
so ACO will need to include nir_builder.h on its own.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28936 >
2024-05-02 12:04:04 +02:00
Marek Olšák
d4cfcbdde8
nir: add ACCESS_CP_GE_COHERENT_AMD
...
required by amd gfx12
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-By: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28889 >
2024-04-30 17:17:25 +00:00
Samuel Pitoiset
86281ef15f
radv: add shaders BO to the cmdbuf BO list at bind time
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28965 >
2024-04-30 07:18:08 +00:00
Samuel Pitoiset
42554e81b9
radv: add RT prolog BO to the cmdbuf BO list at bind time
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28965 >
2024-04-30 07:18:08 +00:00
Samuel Pitoiset
42dc4b463b
radv: add GS copy shader BO to the cmdbuf BO list at bind time
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28965 >
2024-04-30 07:18:08 +00:00
Samuel Pitoiset
2664e058de
radv: use the bound GS copy shader when emitting shader objects
...
Similar but doesn't rely on shader_objs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28965 >
2024-04-30 07:18:08 +00:00
Samuel Pitoiset
be98fe2724
radv: pre-compute VGT_TF_PARAM.DISTRIBUTION_MODE
...
For less CPU overhead.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28965 >
2024-04-30 07:18:08 +00:00
Samuel Pitoiset
d7679c0370
radv: remove useless DB_Z_INFO.NUM_SAMPLES when emitting the MSAA state
...
DB_Z_INFO.NUM_SAMPLES is now correctly set when a null framebuffer is
emitted, so setting it again here is redundant.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28965 >
2024-04-30 07:18:08 +00:00
Samuel Pitoiset
4dd682e227
radv: inline radv_get_pa_su_sc_mode_cntl() in radv_emit_culling()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28965 >
2024-04-30 07:18:08 +00:00
Samuel Pitoiset
e651a2c856
radv: simplify radv_emit_primitive_restart_enable()
...
Move emitting VGT_MULTI_PRIM_IB_RESET_INDX into the GFX6-8 branch.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28965 >
2024-04-30 07:18:08 +00:00
Marek Olšák
8416ba9c25
amd/ci: 17 piglit failures are fixed for raven
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28846 >
2024-04-30 06:47:21 +00:00
Marek Olšák
c87ce78d10
ac/surface: enable thick tiling for 3D textures for better perf on gfx6-8
...
This increases performance 2.5x for Viewperf/Energy on Tonga.
The value of thick_tiling is also fixed.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28846 >
2024-04-30 06:47:20 +00:00
Marek Olšák
33f642aa09
ac/surface: disable DCC for 3D textures on gfx9 to improve performance
...
This improves Viewperf/Energy perf by 60% on Vega10.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28846 >
2024-04-30 06:47:20 +00:00
Marek Olšák
e05aec3fcd
ac/gpu_info: set tcc_rb_non_coherent only if number of TCCs != number of RBs
...
This sets it to false for Navi31 to eliminate unnecessary L2 cache
invalidations.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28846 >
2024-04-30 06:47:20 +00:00
Samuel Pitoiset
0b51868193
radv: remove bogus VkShaderCreateInfoEXT::flags being 0 assert for compute
...
This was a leftover. Flags can be nonzero, e.g. for required
subgroup size, and this should already be correctly supported.
Fixes recent dEQP-VK.shader_object.performance.dispatch_base.
Fixes: 37d7c2172b ("radv: add support for creating/destroying shader objects")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28946 >
2024-04-29 11:45:03 +00:00
Samuel Pitoiset
85deb9f706
radv: simplify DB_Z_INFO.NUM_SAMPLES with null ds target on GFX11
...
According to PAL, the hw uses the smaller value of
DB_Z_INFO.NUM_SAMPLES and PA_SC_AA_CONFIG.MSAA_EXPOSED_SAMPLES when
there is no bound depth/stencil buffer, and it uses 8x to make sure
the used value is MSAA_EXPOSED_SAMPLES.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28952 >
2024-04-29 11:02:02 +00:00
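The reasoning in the commit above reduces to a property of `min`. The sketch below models only the behaviour stated in the message (the hardware takes the smaller of the two values when no depth/stencil buffer is bound). The function name is hypothetical, and it works on plain sample counts rather than on the actual register field encodings.

```python
def effective_samples(db_z_info_num_samples: int, msaa_exposed_samples: int) -> int:
    """Sample count used with a null depth/stencil target, per the commit message."""
    return min(db_z_info_num_samples, msaa_exposed_samples)


NULL_DS_NUM_SAMPLES = 8  # the 8x value programmed for a null depth/stencil target

for exposed in (1, 2, 4, 8):
    # With 8x in DB_Z_INFO.NUM_SAMPLES, the smaller value is always the exposed count.
    print(exposed, effective_samples(NULL_DS_NUM_SAMPLES, exposed))
```

Programming the largest supported value on one side makes the result depend only on MSAA_EXPOSED_SAMPLES, which is the simplification the commit relies on.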
Samuel Pitoiset
dfe5e56671
radv/ci: add more flakes
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28963 >
2024-04-29 08:34:45 +02:00
David Rosca
1f07f5a79b
radv/video: Report maxBitrate in encode capabilities
...
Some cards can do higher bitrate, but 1000 Mbit/s should be high enough
for any practical use. It's also the value that AMF reports as max bitrate.
Fixes: 54d499818c ("radv/video: add initial support for encoding with h264.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28736 >
2024-04-26 09:18:29 +00:00
David Rosca
c210bb7952
radv/video: Check encode profiles and bit depth in capabilities query
...
Fixes: 967e4e09de ("radv/video: add h265 encode support")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28736 >
2024-04-26 09:18:29 +00:00
David Rosca
2d0282f576
radv/video: Set correct bit depth and format for 10bit input
...
Fixes: 967e4e09de ("radv/video: add h265 encode support")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11011
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28736 >
2024-04-26 09:18:29 +00:00
Rhys Perry
ae866966e6
aco/tests: add tests for divergent merge phi with undef
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28661 >
2024-04-26 08:39:01 +00:00