mirror of
https://gitlab.freedesktop.org/mesa/mesa.git
synced 2025-12-20 11:40:10 +01:00
brw: only initialize sample mask flag if needed

This is a refinement of 7c129d9365 ("intel/brw/xe2+: Keep PS sample mask in the
f1.0 register whether or not kill is used."). Rather than always insert this
move, do so only when we'll actually read the register: for memory writes and
for discards. This deletes an instruction from piles of fragment shaders.

shader-db on LNL:

total instructions in shared programs: 17134031 -> 17042706 (-0.53%)
instructions in affected programs: 9065743 -> 8974418 (-1.01%)
helped: 65045
HURT: 0
helped stats (abs) min: 1.0 max: 3.0 x̄: 1.40 x̃: 1
helped stats (rel) min: <.01% max: 50.00% x̄: 3.06% x̃: 1.64%
95% mean confidence interval for instructions value: -1.41 -1.40
95% mean confidence interval for instructions %-change: -3.10% -3.03%
Instructions are helped.

total cycles in shared programs: 885172098 -> 884835306 (-0.04%)
cycles in affected programs: 590294230 -> 589957438 (-0.06%)
helped: 53636
HURT: 4500
helped stats (abs) min: 2.0 max: 1126.0 x̄: 8.02 x̃: 4
helped stats (rel) min: <.01% max: 50.00% x̄: 1.24% x̃: 0.24%
HURT stats (abs) min: 2.0 max: 7706.0 x̄: 20.77 x̃: 6
HURT stats (rel) min: <.01% max: 82.06% x̄: 1.09% x̃: 0.54%
95% mean confidence interval for cycles value: -6.15 -5.43
95% mean confidence interval for cycles %-change: -1.10% -1.02%
Cycles are helped.

LOST: 385
GAINED: 47

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38665>
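The headline percentages in the shader-db report can be re-derived from the raw totals. The following is a minimal sketch, not part of the commit or of shader-db itself; the helper names `percent_change`, `mean_change`, and `round2` are hypothetical and exist only for this illustration.

```cpp
#include <cmath>

// Relative change, in percent, going from `before` to `after`.
// Hypothetical helper; shader-db computes its report with its own tooling.
inline double percent_change(double before, double after) {
    return (after - before) / before * 100.0;
}

// Average absolute change per affected shader.
inline double mean_change(double before, double after, double shaders) {
    return (after - before) / shaders;
}

// Round to two decimal places, matching the precision of the report.
inline double round2(double x) {
    return std::round(x * 100.0) / 100.0;
}

// Example inputs taken from the report above:
//   percent_change(17134031, 17042706)  -> total instruction change
//   percent_change(9065743, 8974418)    -> change within affected programs
//   mean_change(17134031, 17042706, 65045) -> instructions saved per helped shader
```

Both instruction totals drop by the same 91325 instructions, which is why the affected-program percentage (about -1.01%) is larger than the shared-program one (about -0.53%): the denominator is smaller. Dividing that drop by the 65045 helped shaders gives roughly 1.40 instructions per shader, in line with the reported mean.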
parent aa9435f5d1
commit e3328dfa2f
1 changed file with 5 additions and 2 deletions
@@ -1372,9 +1372,12 @@ run_fs(brw_shader &s, bool allow_spilling, bool do_rep_send)
 }

 /* We handle discards by keeping track of the still-live pixels in f0.1.
- * Initialize it with the dispatched pixels.
+ * On Xe2+, we also predicate stores with this mask. Initialize it with
+ * the dispatched pixels if we use discard or (on Xe2) memory stores.
  */
-if (devinfo->ver >= 20 || wm_prog_data->uses_kill) {
+if ((devinfo->ver >= 20 && nir->info.writes_memory) ||
+    wm_prog_data->uses_kill) {
+
    const unsigned lower_width = MIN2(s.dispatch_width, 16);
    for (unsigned i = 0; i < s.dispatch_width / lower_width; i++) {
       /* According to the "PS Thread Payload for Normal
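The effect of the new condition can be checked in isolation. The sketch below is a standalone model, not Mesa code: `MaskInputs`, `old_condition`, `new_condition`, and `count_skipped_inits` are hypothetical names that only mirror the three values the patched `if` reads (`devinfo->ver`, `nir->info.writes_memory`, `wm_prog_data->uses_kill`).

```cpp
#include <cassert>

// Hypothetical stand-in for the three inputs the condition reads.
struct MaskInputs {
    int ver;             // models devinfo->ver; the patch tests ver >= 20
    bool writes_memory;  // models nir->info.writes_memory
    bool uses_kill;      // models wm_prog_data->uses_kill
};

// Condition before the patch: always initialize when ver >= 20.
inline bool old_condition(const MaskInputs &in) {
    return in.ver >= 20 || in.uses_kill;
}

// Condition after the patch: when ver >= 20, initialize only if the shader
// writes memory; discards still trigger it on every version.
inline bool new_condition(const MaskInputs &in) {
    return (in.ver >= 20 && in.writes_memory) || in.uses_kill;
}

// For a given version, count the flag combinations where the old condition
// emitted the initialization and the new one skips it.
inline int count_skipped_inits(int ver) {
    int skipped = 0;
    for (int w = 0; w < 2; w++) {
        for (int k = 0; k < 2; k++) {
            const MaskInputs in{ver, w != 0, k != 0};
            // The new condition is strictly narrower than the old one.
            assert(!new_condition(in) || old_condition(in));
            if (old_condition(in) && !new_condition(in))
                skipped++;
        }
    }
    return skipped;
}
```

In this model the only combination that changes is `ver >= 20` with neither memory writes nor discards, which matches the commit message: the move is dropped exactly when nothing would read the flag register. For versions below 20 the two conditions are identical.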