radv: double pixel throughput in certain cases of PS without interpolated inputs

This reduces the number of initialized VGPRs by 1 when no barycentric
coordinates are used.

I have verified with zink that this indeed increases performance for
cases where sysvals like frag_coord and front_face are used without
interpolated PS inputs.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38936>
Marek Olšák 2025-12-13 01:20:10 -05:00 committed by Marge Bot
parent 8cf154d2eb
commit 3c5c96fedb

@@ -3735,9 +3735,12 @@ radv_compute_spi_ps_input(const struct radv_physical_device *pdev, const struct
       spi_ps_input |= S_0286CC_PERSP_CENTER_ENA(1);
    }
-   if (!(spi_ps_input & 0x7F)) {
-      /* At least one of PERSP_* (0xF) or LINEAR_* (0x70) must be enabled */
-      spi_ps_input |= S_0286CC_PERSP_CENTER_ENA(1);
+   if (!(spi_ps_input & 0x7F) && !G_0286CC_LINE_STIPPLE_TEX_ENA(spi_ps_input)) {
+      /* At least one of PERSP_* (0xF) or LINEAR_* (0x70) or LINE_STIPPLE_TEX must be enabled.
+       * LINE_STIPPLE_TEX uses the least number of initialized VGPRs, so let's use it because
+       * pixel throughput is limited by the number of initialized VGPRs.
+       */
+      spi_ps_input |= S_0286CC_LINE_STIPPLE_TEX_ENA(1);
    }
    return spi_ps_input;
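
The effect of the changed fallback can be illustrated with a small standalone sketch. It is not the radv code: the bit masks below are hypothetical stand-ins for the S_0286CC_* / G_0286CC_* macros, and their bit positions, the FRONT_FACE_ENA bit, and the per-input VGPR costs in barycentric_vgprs() are assumptions made for illustration only. The only facts taken from the commit are the 0xF / 0x70 masks and that the new fallback initializes one VGPR fewer than the old one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the register field macros.
 * Bit positions are assumptions, not copied from the hardware headers. */
#define PERSP_SAMPLE_ENA     (1u << 0)
#define PERSP_CENTER_ENA     (1u << 1)
#define PERSP_CENTROID_ENA   (1u << 2)
#define PERSP_PULL_MODEL_ENA (1u << 3)
#define LINEAR_SAMPLE_ENA    (1u << 4)
#define LINEAR_CENTER_ENA    (1u << 5)
#define LINEAR_CENTROID_ENA  (1u << 6)
#define LINE_STIPPLE_TEX_ENA (1u << 7)
#define FRONT_FACE_ENA       (1u << 12) /* example sysval bit, assumed position */

/* PERSP_* (0xF) plus LINEAR_* (0x70), as in the diff. */
#define BARYCENTRIC_MASK 0x7Fu

/* Fallback before the commit: force PERSP_CENTER when nothing is enabled. */
static uint32_t old_fixup(uint32_t spi_ps_input)
{
   if (!(spi_ps_input & BARYCENTRIC_MASK))
      spi_ps_input |= PERSP_CENTER_ENA;
   return spi_ps_input;
}

/* Fallback after the commit: force LINE_STIPPLE_TEX instead, unless it or a
 * barycentric input is already enabled. */
static uint32_t new_fixup(uint32_t spi_ps_input)
{
   if (!(spi_ps_input & BARYCENTRIC_MASK) && !(spi_ps_input & LINE_STIPPLE_TEX_ENA))
      spi_ps_input |= LINE_STIPPLE_TEX_ENA;

   /* Postcondition stated in the code comment: at least one must be enabled. */
   assert(spi_ps_input & (BARYCENTRIC_MASK | LINE_STIPPLE_TEX_ENA));
   return spi_ps_input;
}

/* Assumed VGPR cost of the inputs covered by the fallback: 2 per (i, j)
 * pair, 3 for the pull model, 1 for LINE_STIPPLE_TEX. */
static unsigned barycentric_vgprs(uint32_t spi_ps_input)
{
   static const uint32_t two_vgpr_inputs[] = {
      PERSP_SAMPLE_ENA, PERSP_CENTER_ENA, PERSP_CENTROID_ENA,
      LINEAR_SAMPLE_ENA, LINEAR_CENTER_ENA, LINEAR_CENTROID_ENA,
   };
   unsigned count = 0;

   for (unsigned i = 0; i < sizeof(two_vgpr_inputs) / sizeof(two_vgpr_inputs[0]); i++) {
      if (spi_ps_input & two_vgpr_inputs[i])
         count += 2;
   }
   if (spi_ps_input & PERSP_PULL_MODEL_ENA)
      count += 3;
   if (spi_ps_input & LINE_STIPPLE_TEX_ENA)
      count += 1;
   return count;
}
```

Under these assumptions, a shader that only reads front_face gets PERSP_CENTER forced on by old_fixup() and LINE_STIPPLE_TEX by new_fixup(), so barycentric_vgprs() reports one VGPR fewer for the new result. A shader that already enables any PERSP_* or LINEAR_* input is left unchanged by both.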