mirror of
https://gitlab.freedesktop.org/mesa/mesa.git
synced 2026-05-05 05:18:08 +02:00
freedreno/ir3: Consistently lower mediump inputs to 16-bit (when we can).
If every use was a conversion to 16, then ir3_cf would fold it into the bary instruction. But if something had generated a highp comparison of the mediump input with a mediump op result, it would get stuck as highp, even though we could have used 16-bit values without upconverting. This fixes dEQP-GLES2.functional.shaders.algorithm.rgb_to_hsl_fragment on ANGLE on turnip, closing #7043. fossil-db results are mixed: fossil-db: Totals from 697 (4.65% of 14988) affected shaders: MaxWaves: 10712 -> 10736 (+0.22%) Instrs: 82394 -> 83572 (+1.43%); split: -1.31%, +2.74% CodeSize: 178280 -> 180118 (+1.03%); split: -0.46%, +1.49% NOPs: 15887 -> 16067 (+1.13%); split: -7.48%, +8.61% MOVs: 1297 -> 1328 (+2.39%); split: -6.86%, +9.25% Full: 3730 -> 3842 (+3.00%); split: -1.80%, +4.80% (ss): 1877 -> 1849 (-1.49%); split: -5.59%, +4.10% (sy): 1249 -> 1255 (+0.48%); split: -1.04%, +1.52% (ss)-stall: 6809 -> 6364 (-6.54%); split: -13.85%, +7.31% (sy)-stall: 17059 -> 17257 (+1.16%); split: -6.51%, +7.67% Cat0: 17220 -> 17400 (+1.05%); split: -6.90%, +7.94% Cat1: 5307 -> 6366 (+19.95%); split: -6.93%, +26.89% Cat2: 39138 -> 39101 (-0.09%); split: -0.31%, +0.22% Cat3: 16772 -> 16741 (-0.18%) Cat5: 1269 -> 1276 (+0.55%) I tried to pick some apps to test that looked the most impacted, and indeed the results are mixed: cookie_run_kingdom: +0.275514% +/- 0.0883816% (n=68) trex_200: +0.0943847% +/- 0.0297073% (n=1463) command_and_conquer_rivals: no difference (n=131) war_planet_online: no difference (n=120) lego_legacy: -0.192131% +/- 0.152083% (n=99) among_us: -0.625227% +/- 0.385419% (n=60) Given that the perf results are small and go both ways, and apparently we're an outlier in not always lowering mediump inputs to 16-bit, just do it for consistency with other drivers. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18506>
This commit is contained in:
parent
f4857591e1
commit
a1c765a44c
3 changed files with 39 additions and 3 deletions
|
|
@ -1178,6 +1178,8 @@ image_from_yuv_textures,-1
|
|||
image_scale_aligned,-1
|
||||
imagealphathreshold_image,-1
|
||||
imageblur,-1
|
||||
# A few pixels at the edge of the blur, nothing more noticeable at those points than the jaggedness elsewhere.
|
||||
imageblurclampmode,180
|
||||
imagefilters_xfermodes,-1
|
||||
imagefiltersbase,-1
|
||||
imagefiltersclipped,-1
|
||||
|
|
|
|||
|
|
@ -207,7 +207,7 @@ traces:
|
|||
freedreno-a530:
|
||||
checksum: 70e18ba06d56fea277cd3fb000729879
|
||||
freedreno-a630:
|
||||
checksum: 9eb6d261c0c5946d8aeb0f41b2b7c1b1
|
||||
checksum: 92944a4996ea019f51cfcdeaf4fc6926
|
||||
|
||||
glmark2/desktop:windows=4:effect=shadow-v2.trace:
|
||||
freedreno-a306:
|
||||
|
|
@ -223,7 +223,7 @@ traces:
|
|||
freedreno-a530:
|
||||
checksum: 7712c8143a244c56922124f4ac207722
|
||||
freedreno-a630:
|
||||
checksum: 7712c8143a244c56922124f4ac207722
|
||||
checksum: 8a3b04aff4848fd36de9d956f24fac2f
|
||||
|
||||
glmark2/effect2d:kernel=1,1,1,1,1;1,1,1,1,1;1,1,1,1,1;-v2.trace:
|
||||
freedreno-a306:
|
||||
|
|
@ -231,7 +231,7 @@ traces:
|
|||
freedreno-a530:
|
||||
checksum: 50f1841f2bb96905c9fcd1815c4a95c0
|
||||
freedreno-a630:
|
||||
checksum: 50f1841f2bb96905c9fcd1815c4a95c0
|
||||
checksum: 05e7d72609840c897f734dca03748ead
|
||||
|
||||
glmark2/function:fragment-steps=5:fragment-complexity=low-v2.trace:
|
||||
freedreno-a306:
|
||||
|
|
|
|||
|
|
@ -463,6 +463,40 @@ ir3_nir_post_finalize(struct ir3_shader *shader)
|
|||
|
||||
if (compiler->gen >= 6 && s->info.stage == MESA_SHADER_FRAGMENT &&
|
||||
!(ir3_shader_debug & IR3_DBG_NOFP16)) {
|
||||
/* Lower FS mediump inputs to 16-bit. If you declared it mediump, you
|
||||
* probably want 16-bit instructions (and have set
|
||||
* mediump/RelaxedPrecision on most of the rest of the shader's
|
||||
* instructions). If we don't lower it in NIR, then comparisons of the
|
||||
* results of mediump ALU ops with the mediump input will happen in highp,
|
||||
* causing extra conversions (and, incidentally, causing
|
||||
* dEQP-GLES2.functional.shaders.algorithm.rgb_to_hsl_fragment on ANGLE to
|
||||
* fail)
|
||||
*
|
||||
* However, we can't do flat inputs because flat.b doesn't have the
|
||||
* destination type for how to downconvert the
|
||||
* 32-bit-in-the-varyings-interpolator value. (also, even if it did, watch
|
||||
* out for how gl_nir_lower_packed_varyings packs all flat-interpolated
|
||||
* things together as ivec4s, so when we lower a formerly-float input
|
||||
* you'd end up with an incorrect f2f16(i2i32(load_input())) instead of
|
||||
* load_input).
|
||||
*/
|
||||
uint64_t mediump_varyings = 0;
|
||||
nir_foreach_shader_in_variable(var, s) {
|
||||
if ((var->data.precision == GLSL_PRECISION_MEDIUM ||
|
||||
var->data.precision == GLSL_PRECISION_LOW) &&
|
||||
var->data.interpolation != INTERP_MODE_FLAT) {
|
||||
mediump_varyings |= BITFIELD64_BIT(var->data.location);
|
||||
}
|
||||
}
|
||||
|
||||
if (mediump_varyings) {
|
||||
NIR_PASS_V(s, nir_lower_mediump_io,
|
||||
nir_var_shader_in,
|
||||
mediump_varyings,
|
||||
false);
|
||||
}
|
||||
|
||||
/* This should come after input lowering, to opportunistically lower non-mediump outputs. */
|
||||
NIR_PASS_V(s, nir_lower_mediump_io, nir_var_shader_out, 0, false);
|
||||
}
|
||||
|
||||
|
|
|
|||
Loading…
Add table
Reference in a new issue