Stop using tgsi_get_gl_frag_result_semantic for fragment outputs. The
direct RC path only needs stable output register indices plus the
OutputColor/OutputDepth mappings, so use NIR gl_frag_result locations
instead.
Assisted-by: Codex (GPT-5.5)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41577>
Store rc_opcode directly in struct ntr_insn, populate it with
RC_OPCODE_* values throughout (the few mismatched names get
explicit ntr_OP wrappers: KILL -> KILP, KILL_IF -> KIL), and use
rc_get_opcode_info instead of tgsi_get_opcode_info when walking
the list to emit RC instructions.
Assisted-by: Codex (GPT-5.5)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41577>
Instead of going through the ureg / TGSI tokens / r300_tgsi_to_rc
parsing round-trip, walk the ntr_insn list and rc_constants_add /
rc_insert_new_instruction the result straight into the
radeon_compiler that the caller passes in. Translation reuses the
rc_translate_* helpers extracted in the previous commit.
Changes touching the surrounding code:
- nir_to_rc returns void and takes a struct radeon_compiler *.
- ntr_compile tracks immediates and UBO size in its own
util_dynarray instead of relying on ureg_DECL_immediate /
ureg_DECL_constant2D's bookkeeping.
- ntr_output_decl tracks the FS output color/depth indices so
nir_to_rc can populate compiler->OutputColor[] /
compiler->OutputDepth at the end - find_output_registers is gone.
- r300_translate_{fragment,vertex}_shader drop the tgsi_scan_shader
+ r300_tgsi_to_rc + ttr.error dance and switch to checking
compiler.Base.Error.
- write_all (gl_FragColor vs gl_FragData[0]) now comes from a NIR
walk in r300_translate_fragment_shader rather than reading the
TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS property.
- r300_tgsi_to_rc.{c,h} are deleted, meson.build updated, and the
obsolete header includes go away in r300_fs.c / r300_vs.c.
Assisted-by: Codex (GPT-5.5)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41577>
Pull translate_opcode, translate_register_file, translate_saturate
and the texture-target switch out of r300_tgsi_to_rc into
nir_to_rc.h as static inline rc_translate_* helpers. r300_tgsi_to_rc
now uses them; this prepares for direct RC emission from
nir_to_rc.c.
Assisted-by: Codex (GPT-5.5)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41577>
This was previously moved from r300_optimize_nir because that was
called in finalize_nir and thus could be called more than once. This is
no longer the case. Also drop the stale nine optimization.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41577>
Stop relying on ureg_DECL_immediate's "expand" path to fill earlier
TGSI immediates' unused components with values from later load_const
instructions, and depend on a later backend pass to do it instead.
Mostly a wash on shader-db: sub-0.1% regressions on inst/cycles/
consts on RV530/RV370/RV410, with one LOST shader on RV370
(trine/fp-17.shader_test FS).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41577>
Drop the TGSI ureg roundtrip in r300_dummy_fragment_shader and
construct the (0, 0, 0, 1) FS straight via nir_builder, matching
the rest of the compile pipeline that already runs on NIR.
Assisted-by: Codex (GPT-5.5)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41577>
The NIR IO intrinsics already carry the locations and register bases
used for the generated declarations, so fill r300_shader_semantics while
emitting the NIR loads and stores.
This removes the FS input semantic scan and lets the VS output setup use
the same NIR-derived information. Track the total number of used
inputs/outputs as well.
Also stop depending on tgsi_info for the external constants.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41577>
Always convert TGSI shaders to NIR up-front in r300_create_{fs,vs}_state
so the rest of the compile pipeline only ever has to deal with NIR. The
TGSI->RC translation in r300_translate_{fragment,vertex}_shader now
always goes through nir_to_rc.
This requires shifting r300_blitter_draw_rectangle's sprite_coord_enable
bit from 0 to 9. The blitter's GENERIC[0] FS input now lands at
fs_inputs->generic[9] after the +9 shift in ntr_fixup_varying_slots, so
the rasterizer's sprite_coord_enable check needs the matching bit.
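The bit arithmetic behind that change can be sketched as follows. This is
only an illustration: the macro and function names are hypothetical, and
the shift of 9 is taken from the description above rather than from the
driver source.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the shift applied to generic slots, taken from the "+9"
 * mentioned in the text above. The macro name is hypothetical. */
#define GENERIC_SLOT_SHIFT 9u

/* Bit in a sprite_coord_enable-style mask for a given GENERIC[n] input. */
static uint32_t
sprite_coord_bit_for_generic(unsigned generic_index)
{
    unsigned slot = generic_index + GENERIC_SLOT_SHIFT;

    assert(slot < 32);
    return UINT32_C(1) << slot;
}
```

With this mapping GENERIC[0] selects bit 9 (0x200) instead of bit 0.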
The draw path still needs TGSI, so we convert it back explicitly for
now. The deTGSIzation of draw paths will come later.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41577>
We don't usually document the details about the various fields here.
Let's drop these, as the comments don't exist in future docs either,
making it a bit easier to diff them.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41721>
The distinction between 8 and 4 bit here is kinda meaningless, because
the extra bits are zero regardless. This matches what the spec says, and
is also consistent with the other XML definitions we have. So let's make
it consistent so we can more easily diff the XML files to see what
changed.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41721>
Most of the XML files do this, so let's do the same here. This
shouldn't matter in practice, as we always set the field anyway.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41721>
This makes things more consistent, but shouldn't make a practical
difference apart from reducing needless diffs.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41721>
We don't specify these for V10, and the default is the zero-value
anyway. Let's drop them to simplify things and reduce needless diffs.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41721>
We do this for all masks from V12 and later, but not always in earlier
gens. Let's fix this up to both produce cleaner dumps and reduce
needless diffs between the files.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41721>
This is what cs_sync32_add etc. expect. Not sure why this isn't
producing compile-time errors, but we should be consistent here anyway.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41721>
This is what we're doing for later gens, so let's be consistent. While
we're at it, fix up the casing of the names as well.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41721>
"Scissor mode" is the name that the HW spec gives to this bit, but it's
more accurately described as "Scissor to bounding box", which is what we
use for V9 and V10. It seems that name was picked intentionally over what
the HW spec calls it.
We should do the same for later gens as well, as this keeps the code
simpler.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41721>
The field here is 8 bits wide according to the spec. However, because all
values that take more than two bits are reserved, this doesn't actually
lead to any misbehavior. But let's make this consistent with the spec
and newer XML files.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41721>
This field is specified to be 4 bits wide, not 3. This doesn't make a
practical difference, because all values with the top bit set are
undefined, so it will always be zero. But we should get it right to
reduce needless diffs here.
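A minimal sketch of why the declared width makes no practical difference
here: as long as the value fits in the narrower field, packing it as 3 or
4 bits yields the same word. The helper and the start bit used in the
usage note are hypothetical, not the generated pack code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical packer: OR `value` into `word` as a `width`-bit field
 * starting at bit `start`. */
static uint32_t
pack_field(uint32_t word, unsigned start, unsigned width, uint32_t value)
{
    assert(width >= 1 && width < 32 && start + width <= 32);

    uint32_t mask = (UINT32_C(1) << width) - 1;

    /* The value has to fit in the declared width. */
    assert((value & ~mask) == 0);
    return word | (value << start);
}
```

For any value below 8, pack_field(0, 5, 3, v) and pack_field(0, 5, 4, v)
return the same word, so only the description of the layout changes.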
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41721>
Most of the work to support predicating draws is already done; we mainly
just needed to support predicating dispatches and wire it up.
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41839>
The scheduler expects that dest values that are marked as pin_group
are used as src values in some instruction that takes a vec4 as source,
otherwise the free channels in the vec4 group are not evaluated correctly.
Fix the extra instructions when lowering buf_txf to backend IR to use free
ALU dest registers.
Fixes: 13b1069a87 ("r600/sfn: Handle pre-EG buffer fetch")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15433
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41835>
If we already have 2.0, do not add a separate -2.0. The negate is free
if the constant is only used as a scalar. We could also do this for
constant vectors, but only for VS: in FS the negate is per source, not
per channel. So keep it simple for now.
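The lookup can be sketched roughly as below. This is an illustration
under assumptions, not the code from this change: the struct and function
names are hypothetical, and the pool is modelled as a flat array of
scalars.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lookup result: which pool entry to read and whether the
 * source operand has to negate it. */
struct scalar_ref {
    int index;   /* index into the pool, or -1 if nothing matched */
    bool negate; /* true if the pool holds -value rather than value */
};

/* Look for `value` in the pool; an exact match wins over a negated one. */
static struct scalar_ref
find_scalar_or_negated(const float *pool, size_t count, float value)
{
    struct scalar_ref ref = { -1, false };

    assert(pool != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (pool[i] == value) {
            ref.index = (int)i;
            ref.negate = false;
            return ref;
        }
        /* Remember the first negated match, but keep looking for an
         * exact one. */
        if (ref.index < 0 && pool[i] == -value) {
            ref.index = (int)i;
            ref.negate = true;
        }
    }
    return ref;
}
```

A caller would add a new constant only when index comes back as -1.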
Shader-db RV410:
total consts in shared programs: 86348 -> 86272 (-0.09%)
consts in affected programs: 6036 -> 5960 (-1.26%)
helped: 76
HURT: 0
total cycles in shared programs: 175335 -> 175332 (<.01%)
cycles in affected programs: 10868 -> 10865 (-0.03%)
helped: 56
HURT: 30
total temps in shared programs: 19487 -> 19510 (0.12%)
temps in affected programs: 362 -> 385 (6.35%)
helped: 9
HURT: 8
total instructions in shared programs: 118451 -> 118461 (<.01%)
instructions in affected programs: 8105 -> 8115 (0.12%)
helped: 51
HURT: 29
LOST: 3
GAINED: 8
Most notably we again compile all glamor shaders, gain 2 tropics ones
and trade 3 lost for 2 gained in gsk, which doesn't matter much since it
will fall back to software after the first linking failure anyway.
RV530:
total cycles in shared programs: 191425 -> 191385 (-0.02%)
cycles in affected programs: 6249 -> 6209 (-0.64%)
helped: 45
HURT: 12
total consts in shared programs: 94030 -> 93967 (-0.07%)
consts in affected programs: 5803 -> 5740 (-1.09%)
helped: 63
HURT: 0
total temps in shared programs: 17037 -> 17040 (0.02%)
temps in affected programs: 49 -> 52 (6.12%)
helped: 1
HURT: 3
total instructions in shared programs: 128823 -> 128789 (-0.03%)
instructions in affected programs: 6164 -> 6130 (-0.55%)
helped: 49
HURT: 19
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41618>
After removing the TGSI layer, load_const values will be emitted directly
as RC immediates without the scalar packing that tgsi_ureg used to do.
This can push fragment shaders past the 32-slot hardware limit on R3xx/R4xx.
Swap dead_constants and dataflow_swizzles pass order so constant
compaction runs before swizzle legalization, giving the legalization
pass an accurate slot count to work with.
In rc_remove_unused_constants, when the slot budget is tight on
R3xx/R4xx, enable aggressive packing for vec-used immediates.
Deduplicate repeated values within an immediate and merge subsequent vec
immediates into existing slots by matching values and filling free
channels.
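A minimal sketch of the merge step, under assumptions: the types and
function names are hypothetical, and the sketch ignores the restricted
set of legal swizzles on R3xx/R4xx that the real pass has to respect.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SLOTS 32 /* slot limit mentioned above */

struct imm_slot {
    float value[4];
    unsigned used; /* number of channels filled, 0..4 */
};

struct imm_pool {
    struct imm_slot slot[MAX_SLOTS];
    unsigned count;
};

/* Return the channel of `s` that holds `v`, or -1. */
static int slot_find(const struct imm_slot *s, float v)
{
    for (unsigned c = 0; c < s->used; c++)
        if (s->value[c] == v)
            return (int)c;
    return -1;
}

/* Try to place the n values into `s`, reusing matching channels and
 * filling free ones. On success write the per-component swizzle. */
static bool slot_try_merge(struct imm_slot *s, const float *v, unsigned n,
                           unsigned char *swz)
{
    struct imm_slot tmp = *s; /* work on a copy so failure leaves s intact */

    for (unsigned i = 0; i < n; i++) {
        int c = slot_find(&tmp, v[i]);
        if (c < 0) {
            if (tmp.used == 4)
                return false;
            c = (int)tmp.used;
            tmp.value[tmp.used++] = v[i];
        }
        swz[i] = (unsigned char)c;
    }
    *s = tmp;
    return true;
}

/* Add an immediate; returns its slot index, or -1 if the pool is full. */
static int pool_add(struct imm_pool *p, const float *v, unsigned n,
                    unsigned char *swz)
{
    assert(n >= 1 && n <= 4);

    for (unsigned i = 0; i < p->count; i++)
        if (slot_try_merge(&p->slot[i], v, n, swz))
            return (int)i;

    if (p->count == MAX_SLOTS)
        return -1;

    p->slot[p->count].used = 0;
    bool ok = slot_try_merge(&p->slot[p->count], v, n, swz);
    assert(ok); /* an empty slot always fits up to four values */
    (void)ok;
    return (int)p->count++;
}
```

Repeated values inside one immediate collapse on their own here, since
the second occurrence finds the channel the first one just filled.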
Very small win on R5xx and very small hit on R3xx/R4xx (due to the
smaller number of legal swizzles).
Shader-db RV530:
total cycles in shared programs: 191452 -> 191425 (-0.01%)
cycles in affected programs: 5168 -> 5141 (-0.52%)
helped: 24
HURT: 10
total temps in shared programs: 17046 -> 17037 (-0.05%)
temps in affected programs: 201 -> 192 (-4.48%)
helped: 11
HURT: 5
total consts in shared programs: 94033 -> 94030 (<.01%)
consts in affected programs: 277 -> 274 (-1.08%)
helped: 5
HURT: 5
total instructions in shared programs: 128840 -> 128823 (-0.01%)
instructions in affected programs: 3588 -> 3571 (-0.47%)
helped: 25
HURT: 12
RV410:
total cycles in shared programs: 176230 -> 176270 (0.02%)
cycles in affected programs: 20598 -> 20638 (0.19%)
helped: 51
HURT: 66
total temps in shared programs: 19655 -> 19650 (-0.03%)
temps in affected programs: 1310 -> 1305 (-0.38%)
helped: 37
HURT: 25
total instructions in shared programs: 119346 -> 119379 (0.03%)
instructions in affected programs: 13884 -> 13917 (0.24%)
helped: 58
HURT: 65
total consts in shared programs: 86146 -> 86412 (0.31%)
consts in affected programs: 3093 -> 3359 (8.60%)
helped: 8
HURT: 182
Assisted-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41618>