mesa/src/intel
Kenneth Graunke 837c441acb intel/nir: Don't needlessly split u2f16 for nir_type_uint32
Commit f695a9fed2 moved the 64-bit float <-> 16-bit float conversion
splitting into a core NIR pass, so the code remaining here is only
needed for 64-bit integer types.

Presumably in an attempt to remove the float handling, it replaced
simple bit_size == 64 checks with this expression:

   (full_type & (nir_type_int64 | nir_type_uint64))

I believe that the intended expression was:

   (full_type == nir_type_int64 || full_type == nir_type_uint64)

Unfortunately, the former is incorrect.  Any signed or unsigned integer
NIR type, regardless of bit size, makes the former expression non-zero.
For example:

   nir_type_uint32 & (nir_type_int64 | nir_type_uint64) => nir_type_uint

This meant that we were splitting e.g. u2f16 on 32-bit unsigned types
into u2f32 and f2f16, when we can easily natively handle that case.

To fix this, we go back to simple bit_size == 64 checks.  This pass is
already run after nir_lower_fp16_casts which will split the float case,
so we will never see it here.
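
As a rough illustration of the three checks discussed above, here is a
standalone sketch.  It does not use Mesa headers: the sketch_* names are
hypothetical, and the numeric values are assumed to mirror the
nir_alu_type encoding in nir.h (base type in the low bits, bit size
ORed in).  The actual fix tests the bit_size the pass already has at
hand; the last helper only approximates that by masking the size bits
out of the type.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for nir_alu_type.  Values are an assumption
 * based on the encoding in nir.h, not copied from the commit. */
enum sketch_alu_type {
   SKETCH_TYPE_INT    = 2,
   SKETCH_TYPE_UINT   = 4,
   SKETCH_TYPE_FLOAT  = 128,
   SKETCH_TYPE_INT32  = 32 | SKETCH_TYPE_INT,
   SKETCH_TYPE_UINT32 = 32 | SKETCH_TYPE_UINT,
   SKETCH_TYPE_INT64  = 64 | SKETCH_TYPE_INT,
   SKETCH_TYPE_UINT64 = 64 | SKETCH_TYPE_UINT,
};

/* Assumed to correspond to NIR_ALU_TYPE_SIZE_MASK (bits 1, 8, 16, 32, 64). */
#define SKETCH_SIZE_MASK (1 | 8 | 16 | 32 | 64)

/* The expression the commit removes: a bitwise AND against the OR of the
 * two 64-bit integer types.  The mask contains the int and uint base
 * bits, so any integer type of any size yields a non-zero result. */
static inline bool
sketch_buggy_check(unsigned full_type)
{
   return (full_type & (SKETCH_TYPE_INT64 | SKETCH_TYPE_UINT64)) != 0;
}

/* What the commit message believes was intended: exact equality. */
static inline bool
sketch_intended_check(unsigned full_type)
{
   return full_type == SKETCH_TYPE_INT64 || full_type == SKETCH_TYPE_UINT64;
}

/* Approximation of the restored check: only the bit size matters. */
static inline bool
sketch_bit_size_check(unsigned full_type)
{
   return (full_type & SKETCH_SIZE_MASK) == 64;
}
```

With these definitions, sketch_buggy_check(SKETCH_TYPE_UINT32) is true
because the shared uint base bit survives the AND, while the other two
helpers reject it.  All three accept SKETCH_TYPE_UINT64.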

fossil-db on Alchemist shows a 1.14% reduction in affected
google-meet-clvk shaders.  In another ChromeOS workload, it improves
performance by around 8% on Meteorlake.

Thanks to Sushma Venkatesh Reddy for finding this performance issue!

Fixes: f695a9fed2 ("intel/compiler: use nir_lower_fp16_casts")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30091>
2024-07-11 02:37:05 -07:00
blorp blorp: Ensure MSAA fast clear in correct modes (xe2) 2024-07-02 19:03:19 +00:00
ci ci: simplify setting .no-auto-retry now that it isn't bundled with unrelated rules: 2024-07-07 19:31:44 +00:00
common intel/common: fix building error in intel_common.c 2024-07-02 23:35:26 +00:00
compiler intel/nir: Don't needlessly split u2f16 for nir_type_uint32 2024-07-11 02:37:05 -07:00
decoder intel/decoder: Add intel_print_group_custom_spacing() 2024-04-24 17:07:50 +00:00
dev intel/dev: Replace intel_device_info::apply_hwconfig by a gfx version check 2024-07-03 22:17:37 +00:00
ds intel/ds: remove duplicate arguments 2024-07-03 21:10:13 +00:00
genxml build: pass licensing information in SPDX form 2024-06-29 12:42:49 -07:00
isl isl: Add some formats not covered in CMF table (xe2) 2024-07-02 19:03:19 +00:00
nullhw-layer build: pass licensing information in SPDX form 2024-06-29 12:42:49 -07:00
perf intel/perf/xe: Fix free pointer location in xe_add_config() 2024-07-05 00:25:03 -07:00
shaders meson: use glslang --depfile argument when possible 2024-05-20 17:34:17 +00:00
tools build: pass licensing information in SPDX form 2024-06-29 12:42:49 -07:00
vulkan anv: fix u_trace on < Gfx12.0 2024-07-03 21:10:13 +00:00
vulkan_hasvk hasvk: pass anv_address to predicate helper 2024-07-03 21:10:13 +00:00
meson.build build: pass licensing information in SPDX form 2024-06-29 12:42:49 -07:00