Commit graph

18 commits

Author SHA1 Message Date
Pierre-Eric Pelloux-Prayer
3bcbd11a33 aco/isel: fix visit_tex handling of is_sparse
For cases when less than 4 components are read, the original code
would compute an incorrect dmask. eg: with a single component + is_sparse,
the dmask was 0x13:
  - 0x 3 = coming from nir_def_components_read
  - 0x10 = the sparse bit
While it should have at 2 bits set (1 for the color/depth, 1 for tfe).

This caused problem when expand_vector() used the dmask to generate
the final results, because the value for the sparse component was
read from the wrong index.

So after the call to emit_mimg() dmask needs to be adjusted
because the components will be stored in order, so if mask is 0x11
the tfe value would be stored at invalid index=5 (while it should
be at index=1).

This fixes KHR-GL46.sparse_texture_clamp_tests.SparseTextureClampLookupResidency_texture_2d_depth_component16
and KHR-GL46.sparse_texture2_tests.SparseTexture2Lookup_texture_2d_depth_component16
with ACO.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35206>
2025-06-11 12:11:28 +00:00
Georg Lehmann
d95e90ab5f aco: do not use v_cvt_pk_u8_f32 for f2u8
The ISA docs don't mention this, but instead of always truncating
like other integer conversions, this opcode actually uses the single
precision rounding mode.

We could continue to use the opcode and set the rounding mode to rtz
in lower_to_hw_instrs, but I think I should just concede that f2u8
isn't worth the effort.

Fixes: 9bb10b58 ("aco: use v_cvt_pk_u8_f32 for f2u8")

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35391>
2025-06-10 07:32:04 +00:00
Marek Olšák
80236f2367 ac/nir/tess: add if/endif for HS threads in NIR instead of ACO/LLVM
This just removes the if/endif wrapping for LLVM, and hopefully the ACO
change does the same thing.

ACO had redundant code in endif_merged_wave_info, which is removed here.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>
2025-06-07 16:29:38 +00:00
Georg Lehmann
a6675f35b2 aco: clamp exponent of 16bit ldexp
The hw uses only a 16bit int, but NIR's src is 32bit.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34073>
2025-06-03 06:34:18 +00:00
Samuel Pitoiset
9692ef41a3 aco: implement bitfield_extract for 8-bit/16-bit
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35199>
2025-05-29 12:24:59 +00:00
Samuel Pitoiset
8596150ae8 aco: implement bitfield_reverse for types other than 32-bits
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34583>
2025-05-28 09:52:12 +00:00
Daniel Schürmann
5b4d284493 aco/isel: use vector-aligned operands for image_bvh64_intersect_ray
Totals from 93 (0.12% of 79377) affected shaders: (Navi48)
MaxWaves: 1376 -> 1368 (-0.58%)
Instrs: 3583500 -> 3581861 (-0.05%); split: -0.05%, +0.00%
CodeSize: 18792300 -> 18785296 (-0.04%); split: -0.04%, +0.00%
VGPRs: 8652 -> 8592 (-0.69%); split: -1.25%, +0.55%
Latency: 20861347 -> 20834407 (-0.13%); split: -0.17%, +0.04%
InvThroughput: 4032604 -> 4028020 (-0.11%); split: -0.14%, +0.03%
VClause: 90507 -> 90525 (+0.02%); split: -0.01%, +0.03%
Copies: 279429 -> 277839 (-0.57%); split: -0.58%, +0.01%
Branches: 100260 -> 100251 (-0.01%)
PreVGPRs: 8949 -> 8771 (-1.99%)
VALU: 1955635 -> 1954053 (-0.08%); split: -0.08%, +0.00%
SALU: 477347 -> 477329 (-0.00%); split: -0.01%, +0.01%
VOPD: 69 -> 61 (-11.59%)

Totals from 93 (0.12% of 79377) affected shaders: (Navi31)

MaxWaves: 1376 -> 1374 (-0.15%)
Instrs: 3442606 -> 3440344 (-0.07%); split: -0.07%, +0.00%
CodeSize: 17801008 -> 17790476 (-0.06%); split: -0.07%, +0.01%
VGPRs: 8652 -> 8556 (-1.11%); split: -1.25%, +0.14%
Latency: 20590943 -> 20542279 (-0.24%); split: -0.27%, +0.03%
InvThroughput: 3978133 -> 3969497 (-0.22%); split: -0.25%, +0.03%
VClause: 91784 -> 91769 (-0.02%); split: -0.05%, +0.03%
Copies: 277177 -> 275263 (-0.69%); split: -0.70%, +0.01%
Branches: 100098 -> 100092 (-0.01%); split: -0.02%, +0.01%
PreVGPRs: 9021 -> 8843 (-1.97%)
VALU: 2001794 -> 1999893 (-0.09%); split: -0.10%, +0.00%
SALU: 419504 -> 419559 (+0.01%); split: -0.01%, +0.02%
VOPD: 77 -> 64 (-16.88%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34359>
2025-05-28 09:24:17 +00:00
Daniel Schürmann
64eed6807a aco/isel: move visit_intrinsic() into separate file
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34977>
2025-05-16 11:01:19 +00:00
Daniel Schürmann
8aae636e38 aco/isel: move visit_alu_instr() into separate file
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34977>
2025-05-16 11:01:19 +00:00
Daniel Schürmann
5342576789 aco/isel: rename aco_instruction_selection.cpp -> aco_isel_nir.cpp
Also remove some unused includes and unnecessary static specifiers.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34977>
2025-05-16 11:01:19 +00:00
Daniel Schürmann
b6442669c1 aco/isel: move select_ps_epilog() into separate file
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34977>
2025-05-16 11:01:19 +00:00
Daniel Schürmann
776384d99d aco/isel: move select_ps_prolog() into separate file
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34977>
2025-05-16 11:01:19 +00:00
Daniel Schürmann
c3ef927e31 aco/isel: move select_vs_prolog() into separate file
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34977>
2025-05-16 11:01:19 +00:00
Daniel Schürmann
c4ec430c26 aco/isel: move select_rt_prolog() into separate file
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34977>
2025-05-16 11:01:19 +00:00
Daniel Schürmann
4d910ca301 aco/isel: move select_trap_handler_shader() into separate file
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34977>
2025-05-16 11:01:19 +00:00
Daniel Schürmann
146ce57f2d aco/isel: move control-flow helper functions into separate file
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34977>
2025-05-16 11:01:19 +00:00
Daniel Schürmann
59f314a9a6 aco/isel: move some helper functions into a separate file
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34977>
2025-05-16 11:01:19 +00:00
Daniel Schürmann
62a92417ef aco: move instruction selection files to /compiler/instruction selection/ subfolder
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34977>
2025-05-16 11:01:19 +00:00