Port from ac_nir_to_llvm.c and si_shader_llvm_resource.c.
Due to need waterfall of llvm backend, we can't get bind-texture
descriptor directly in nir. So we keep load_sampler_desc abi only
for bind-texture index to desc.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18666>
RADV use dri option to enabled this for some apps, but it's
done in nir lower currently. I'm afraid it still needs this
option to handle the non-uniform case as desc is loaded in
llvm.
radeonsi always enable this for bind-textures.
radeonsi will lower all bind-textures to bindless-textures,
and only bind-textures use desc index, so add this abi for
bindless desc index path.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18666>
There are a lot of optimizations in opt_algebraic that match ('ine', a,
0), but there are almost none that match i2b. Instead of adding a huge
pile of additional patterns (including variations that include both ine
and i2b), always lower i2b to a != 0.
At this point in the series, it should be impossible for anything to
generate i2b, so there /should not/ be any changes.
The failing test on d3d12 is a pre-existing bug that is triggered by
this change. I talked to Jesse about it, and, after some analysis, he
suggested just adding it to the list of known failures.
v2: Don't rematerialize i2b instructions in dxil_nir_lower_x2b.
v3: Don't rematerialize i2b instructions in zink_nir_algebraic.py.
v4: Fix zink-on-TGL CI failures by calling nir_opt_algebraic after
nir_lower_doubles makes progress. The latter can generate b2i
instructions, but nir_lower_int64 can't handle them (anymore).
v5: Add back most of the hunk at line 2125 of nir_opt_algebraic.py. I
had accidentally removed the f2b(bf2(x)) optimization.
v6: Just eliminate the i2b instruction.
v7: Remove missed i2b32 in midgard_compile.c. Remove (now unused)
emit_alu_i2orf2_b1 function from sfn_instr_alu.cpp. Previously this
function was still used. 🤷
No shader-db changes on any Intel platform.
All Intel platforms had similar results. (Ice Lake shown)
Instructions in all programs: 141165875 -> 141165873 (-0.0%)
Instructions helped: 2
Cycles in all programs: 9098956382 -> 9098956350 (-0.0%)
Cycles helped: 2
The two Vulkan shaders are helped because of the "new" (('b2i32',
('ine', ('ubfe', a, b, 1), 0)), ('ubfe', a, b, 1)) algebraic pattern.
Acked-by: Jesse Natalie <jenatali@microsoft.com> [earlier version]
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev> [earlier version]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
For simple intrinsic which also need other fields to translate
to LLVM like stream_id.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20158>
It's illegal and LLVM always knows which intrinsics don't read memory.
This started failing IR validation with LLVM 16.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20146>
Used by radeonsi for lower nir_atomic_add_gen/xfb_prim_count_amd.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18010>
For radeonsi which use 32bit address in ac_build_load_to_sgpr().
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18010>
LDS will be accessed starting from esgs_ring which has offset 0.
So ngg_scratch and ngg_emit base address is just the offset from
the esgs_ring base.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17109>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19966>
NIR shifts are defined to truncate the shift amount to the number of bits
needed to represent the bit-size of the value shifted. LLVM treats large
shifts as poison. This fix achieves NIR semantics for shifts.
As an example, a|(b << 32), where "a" is 32bits, should produce a|b
according to NIR (because 32&31 == 0).
This caused LLVM to incorrectly optimize "(a >> c) | (b << (32 - c))" to a
u2u32(pack_64_2x32(a, b) >> c) (v_alignbit_b32), when the original NIR
should have returned "a | b" if c==0.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19966>
Found when 16bit vec3 varying with llvm14 (not found
when llvm15), one 32bit vec4 slot is filled like this:
vec3[0] | undef
vec3[1] | undef
vec3[2] | undef
undef | undef
LLVM error is for the elements with undef:
%287 = insertelement float %280, half %279, i64 0
After this change, we get:
%287 = insertelement <2 x half> %280, half %279, i64 0
Fixes: 279eea5bda ("amd/llvm: Transition to LLVM "opaque pointers"")
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19848>