Commit graph

7 commits

Author SHA1 Message Date
Caio Oliveira
db8022dc4d intel/brw: Use helper to create accumulator register
This ensure the region triple <V,W,H> is set correctly, in this case the
desired region is a sequential like <8,8,1>.  Without the helper the
sequence we get is <0,1,0> -- which the generator currently partially
adjusts when emitting code, but is not sufficient when doing validation
earlier.

The code generated code is slightly modified.  From crucible test
func.shader.subtractSaturate.uint in the fragment shader for SIMD8, the
diff looks like

```
 mov(8)          acc0<1>UD       g21<8,8,1>UD                    { align1 1Q $0.dst };
-add.sat(8)      g22<1>UD        -acc0<0,1,0>UD  g16<8,8,1>UD    { align1 1Q @1 $0.dst };
+add.sat(8)      g22<1>UD        -acc0<8,8,1>UD  g16<8,8,1>UD    { align1 1Q @1 $0.dst };
```

Note that without the patch generator adjusted the hstride for acc0 used
as destination (see brw_set_dest), but kept the src region as is.  For
the source, it is not clear to me why the <0,1,0> would work correctly
here since it is a scalar, but using <8,8,1> it is correct.

Fixes: 58907568ec ("intel/fs: Add SHADER_OPCODE_[IU]SUB_SAT pseudo-ops")
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28059>
2024-03-13 03:23:30 +00:00
Kenneth Graunke
1c1e79d75a intel/brw: Copy the smaller payload in fixup_sends_duplicate_payload
Sometimes one source can be a larger register than the other, especially
since opt_register_coalesce can sometimes coalesce those sources into
larger registers.

Copy the smaller of mlen and ex_mlen.  It's less copying.

shader-db and fossil-db on Alchemist show 47 shaders affected with
small 1-2 instruction improvements each, and no regressions.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27876>
2024-03-05 11:39:26 +00:00
Caio Oliveira
d9552fccf2 intel/brw: Remove extra stage_prog_data field in fs_visitor
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27861>
2024-02-29 19:28:06 +00:00
Caio Oliveira
8f3c52c1da intel/brw: Remove MRF type
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>
2024-02-28 05:45:39 +00:00
Caio Oliveira
7ac5696157 intel/brw: Remove Gfx8- code from backend passes
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>
2024-02-28 05:45:38 +00:00
Ian Romanick
8fb37ef985 intel/fs: Add fast path for ballot(true)
This doesn't help very much now. A later commit adds a NIR optimization
pass, tentatively called nir_opt_uniform_subgroup, that converts many
kinds of subgroup operations to things involving
bitCount(ballot(true)). This commit makes a huge difference in the
results of that later commit.

No shader-db changes on any Intel platform.

Fossil-db results:

All Intel platforms had similar results. (Ice Lake shown)
Totals:
Instrs: 165558033 -> 165557519 (-0.00%)
Cycles: 15156188362 -> 15156178922 (-0.00%); split: -0.00%, +0.00%

Totals from 299 (0.05% of 656117) affected shaders:
Instrs: 88293 -> 87779 (-0.58%)
Cycles: 3709498 -> 3700058 (-0.25%); split: -0.28%, +0.03%

v2: Rebase on splitting ELK from BRW. Remove devinfo->ver >= 8 check.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27044>
2024-02-27 08:37:46 -08:00
Caio Oliveira
4fe3498e72 intel/brw: Move small lowering passes into brw_fs_lower.cpp
Larger lowering passes will go to their own files.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26887>
2024-02-26 20:54:25 +00:00