mirror of
https://gitlab.freedesktop.org/mesa/mesa.git
synced 2025-12-24 08:50:13 +01:00
intel/brw/xe3+: Adjust weights of discard control flow for non-EU-fused platforms.

Until now, on platforms without EU fusion (all platforms other than
gfx12.x), we used a constant discard_weight = 1.0 regardless of SIMD
width. This was far from ideal, since it made the performance analysis
pass fully insensitive to the presence of discard jumps. The scheduler
is able to move code past a discard statement, so the range of the
program under discard control flow can vary and have a material effect
on the relative performance of SIMD16 vs. SIMD32, since the scheduler
is typically more constrained in SIMD32 dispatch mode.

To fix this, use a discard_weight lower than 1.0 for all dispatch
modes, so that the performance analysis pass accounts for the presence
and range of discard control flow. In addition, use a lower
discard_weight for SIMD16 dispatch than for SIMD32, as we do on
Gfx12.x, in order to account for the higher likelihood of divergent
discard in SIMD32 mode.

The specific weights were determined iteratively on PTL based on the
final FPS result of several traces that are sensitive to the dispatch
width of one or more fragment shaders that use discard, to ensure that
in none of those cases we end up using the lower-performing dispatch
width variant. This avoids regressions between 0.8% and 3.7% in
Superposition-trace-dx11-2160p-extreme,
BaldursGate3-trace-dx11-1440p-ultra and
MetroExodus-trace-dx11-2160p-ultra after enabling the static
analysis-based SIMD32 heuristic on PTL.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1)
v2: Limit to xe3+ for now since the performance effect seems to be a
wash on xe2.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36618>
parent 1272ff5ed1
commit 6ccf2a375a
1 changed file with 4 additions and 2 deletions
@@ -1062,8 +1062,10 @@ namespace {
        * EU fusion has been removed on Xe2+ so its divergence behavior is
        * expected to be closer to pre-Gfx12 platforms.
        */
-      const float discard_weight = (dispatch_width > 16 || s->devinfo->ver != 12 ?
-                                    1.0 : 0.5);
+      const float discard_weight =
+         s->devinfo->ver >= 30 ? (dispatch_width > 16 ? 0.75 : 0.525) :
+         s->devinfo->ver == 12 ? (dispatch_width > 16 ? 1.0 : 0.5) :
+         1.0;
       const float loop_weight = 10;
       unsigned halt_count = 0;
       unsigned elapsed = 0;
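
To illustrate why a weight below 1.0 makes the estimate sensitive to the range of code under discard control flow, here is a minimal standalone sketch. It is not the performance analysis pass itself: `select_discard_weight`, `toy_inst` and `weighted_cycles` are hypothetical names introduced here, `ver` stands in for `s->devinfo->ver`, and the toy model simply assumes that every instruction after the first discard jump is scaled by the weight.

```cpp
#include <cassert>
#include <vector>

// Hypothetical standalone copy of the selection in the hunk above.
// `ver` stands in for s->devinfo->ver (assumption: 30 and up means xe3+).
static float select_discard_weight(int ver, unsigned dispatch_width)
{
   return ver >= 30 ? (dispatch_width > 16 ? 0.75f : 0.525f) :
          ver == 12 ? (dispatch_width > 16 ? 1.0f : 0.5f) :
          1.0f;
}

// Toy instruction: a fixed cycle cost and a flag marking a discard jump.
// This is an illustrative type, not a Mesa data structure.
struct toy_inst {
   unsigned cycles;
   bool is_discard_jump;
};

// Toy estimate (assumption, not the real pass): instructions that follow
// the first discard jump contribute their cost scaled by discard_weight.
static float weighted_cycles(const std::vector<toy_inst> &prog,
                             float discard_weight)
{
   assert(discard_weight > 0.0f);
   float weight = 1.0f;
   float total = 0.0f;
   bool seen_discard = false;

   for (const toy_inst &inst : prog) {
      total += weight * inst.cycles;
      if (inst.is_discard_jump && !seen_discard) {
         seen_discard = true;
         weight *= discard_weight;
      }
   }
   return total;
}
```

In this toy model, a program whose discard jump comes first, followed by three 10-cycle instructions, costs less than the same instructions with the discard jump last whenever the weight is below 1.0. With a weight of exactly 1.0 both orderings cost the same, which is the insensitivity the commit message describes.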