aco/isel: use s_bitcmp1 for 1bit ubfe

Avoid the s_pack at the cost of having to use scc.

Foz-DB GFX1201:
Totals from 1514 (0.74% of 205032) affected shaders:
Instrs: 3443431 -> 3434096 (-0.27%); split: -0.27%, +0.00%
CodeSize: 19062100 -> 19024320 (-0.20%); split: -0.20%, +0.00%
Latency: 22343329 -> 22342802 (-0.00%); split: -0.01%, +0.01%
InvThroughput: 4471707 -> 4471632 (-0.00%); split: -0.00%, +0.00%
Copies: 280191 -> 279645 (-0.19%); split: -0.21%, +0.01%
PreSGPRs: 71333 -> 71327 (-0.01%)
VALU: 1598064 -> 1598058 (-0.00%); split: -0.00%, +0.00%
SALU: 691458 -> 686437 (-0.73%)

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40707>
Georg Lehmann 2026-03-30 15:24:32 +02:00 committed by Marge Bot
parent 4cf6cd85e0
commit 5453419086

@@ -3572,7 +3572,9 @@ visit_alu_instr(isel_context* ctx, nir_alu_instr* instr)
       Temp offset = get_alu_src(ctx, instr->src[1]);
       Temp bits = get_alu_src(ctx, instr->src[2]);
-      if (ctx->program->gfx_level >= GFX9) {
+      if (instr->op == nir_op_ubfe && const_bits && (const_bits->u32 & 0x1f) == 1) {
+         bld.sopc(aco_opcode::s_bitcmp1_b32, Definition(dst, scc), base, offset);
+      } else if (ctx->program->gfx_level >= GFX9) {
          Operand bits_op = const_bits ? Operand::c32(const_bits->u32 & 0x1f)
                                       : bld.sop2(aco_opcode::s_and_b32, bld.def(s1),
                                                  bld.def(s1, scc), bits, Operand::c32(0x1fu));
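
For illustration only, here is a minimal standalone sketch of why the new branch is sound: extracting a 1-bit unsigned field at some offset yields the same value as testing that single bit. It is not part of the commit. The names ubfe_model, bitcmp1_model and check_equivalence are hypothetical, and the sketch assumes that both the offset and the width of the extract are taken modulo 32, in line with the 0x1f masks in the diff, and that the bit test reports "bit set" as a boolean, which is how the SCC result is modelled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reference model of a 32-bit unsigned bitfield extract.
// Assumption: offset and width are both masked to their low 5 bits.
static uint32_t ubfe_model(uint32_t base, uint32_t offset, uint32_t bits)
{
   offset &= 0x1f;
   bits &= 0x1f;
   if (bits == 0)
      return 0;
   if (offset + bits < 32)
      return (base << (32 - bits - offset)) >> (32 - bits);
   return base >> offset;
}

// Hypothetical model of a single-bit test: true when bit (offset & 31)
// of base is set. This stands in for the SCC result of the bit compare.
static bool bitcmp1_model(uint32_t base, uint32_t offset)
{
   return ((base >> (offset & 0x1f)) & 1u) != 0;
}

// Compare both models for a 1-bit field over a few values and every
// offset in [0, 64), so that offset wrap-around is covered as well.
static void check_equivalence()
{
   const uint32_t samples[] = {0u, 1u, 0x80000000u, 0xdeadbeefu, 0xffffffffu};
   for (uint32_t base : samples) {
      for (uint32_t offset = 0; offset < 64; ++offset) {
         uint32_t expected = bitcmp1_model(base, offset) ? 1u : 0u;
         assert(ubfe_model(base, offset, 1) == expected);
      }
   }
}
```

Calling check_equivalence() from a test harness should pass without tripping an assertion. Because the bit test takes base and offset as two separate operands, the combined offset/width operand that the general bitfield-extract path has to build is not needed in this case.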