Commit graph

1811 commits

Author SHA1 Message Date
Ian Romanick
f0a8a9816a nir: intel/compiler: Add and use nir_op_pack_32_4x8_split
A lot of CTS tests write a u8vec4 or an i8vec4 to an SSBO.  This results
in a lot of shifts and MOVs.  When that pattern can be recognized, the
individual 8-bit components can be packed much more efficiently.

v2: Rebase on b4369de27f ("nir/lower_packing: use
shader_instructions_pass")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
2021-08-18 22:03:37 +00:00
Ian Romanick
7c83aa0518 intel/fs: Emit better code for u2u of extract
Emitting the instructions one by one results in two MOV instructions
that won't be propagated.  By handling both instructions at once, a
single MOV is emitted.  For example, on Ice Lake this helps
dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag:

SIMD8 shader: 49 instructions. 1 loops. 4044 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 41 instructions. 1 loops. 3804 cycles. 0:0 spills:fills, 5 sends

Without "intel/fs: Allow copy propagation between MOVs of mixed sizes,"
the improvement is still 8 instructions, but there are more instructions
to begin with:

SIMD8 shader: 52 instructions. 1 loops. 4164 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 44 instructions. 1 loops. 3944 cycles. 0:0 spills:fills, 5 sends

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
2021-08-18 22:03:37 +00:00
Ian Romanick
e3f502e007 intel/fs: Allow copy propagation between MOVs of mixed sizes
This eliminates some spurious, size-converting moves.  For example, on
Ice Lake this helps dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag:

SIMD8 shader: 52 instructions. 1 loops. 4164 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 49 instructions. 1 loops. 4044 cycles. 0:0 spills:fills, 5 sends

Unfortunately, this doesn't clean everything up.  Here's a subset of the
"before" assembly:

send(8)         g11<1>UW        g2<0,1,0>UD     0x02106e02
                            dp data 1 MsgDesc: ( untyped surface read, Surface = 2, SIMD8, Mask = 0xe) mlen 1 rlen 1 { align1 1Q };
mov(8)          g7<4>UB         g11<8,8,1>UD                    { align1 1Q };
mov(8)          g12<1>UB        g7<32,8,4>UB                    { align1 1Q };
send(8)         g13<1>UW        g2<0,1,0>UD     0x02106e03
                            dp data 1 MsgDesc: ( untyped surface read, Surface = 3, SIMD8, Mask = 0xe) mlen 1 rlen 1 { align1 1Q };
mov(8)          g15<1>UW        g12<8,8,1>UB                    { align1 1Q };
mov(8)          g8<4>UB         g13<8,8,1>UD                    { align1 1Q };
mov(8)          g14<1>UB        g8<32,8,4>UB                    { align1 1Q };
mov(8)          g16<1>UW        g14<8,8,1>UB                    { align1 1Q };
xor(8)          g17<1>UW        g15<8,8,1>UW    g16<8,8,1>UW    { align1 1Q };

And here's the same subset of the "after" assembly:

send(8)         g11<1>UW        g2<0,1,0>UD     0x02106e02
                            dp data 1 MsgDesc: ( untyped surface read, Surface = 2, SIMD8, Mask = 0xe) mlen 1 rlen 1 { align1 1Q };
mov(8)          g7<4>UB         g11<8,8,1>UD                    { align1 1Q };
send(8)         g13<1>UW        g2<0,1,0>UD     0x02106e03
                            dp data 1 MsgDesc: ( untyped surface read, Surface = 3, SIMD8, Mask = 0xe) mlen 1 rlen 1 { align1 1Q };
mov(8)          g15<1>UW        g7<32,8,4>UB                    { align1 1Q };
mov(8)          g8<4>UB         g13<8,8,1>UD                    { align1 1Q };
mov(8)          g16<1>UW        g8<32,8,4>UB                    { align1 1Q };
xor(8)          g17<1>UW        g15<8,8,1>UW    g16<8,8,1>UW    { align1 1Q };

There are a lot of regioning and type restrictions in
fs_visitor::try_copy_propagate, and I'm a little nervious about messing
with them too much.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
2021-08-18 22:03:37 +00:00
Ian Romanick
f9665040f1 intel/compiler: Document and assert some aspects of 8-bit integer lowering
In the vec4 compiler, 8-bit types should never exist.

In the scalar compiler, 8-bit types should only ever be able to exist on
Gfx ver 8 and 9.

Some instructions are handled in non-obvious ways.

Hopefully this will save the next person some time.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
2021-08-18 22:03:37 +00:00
Rhys Perry
ed70b256ce nir: add ffma creation helpers
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8056>
2021-08-16 17:19:45 +00:00
Eric Engestrom
f1eae2f8bb python: drop python2 support
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3674>
2021-08-14 21:44:32 +00:00
Jason Ekstrand
c858d30833 intel/vec4: Don't override emit_urb_write_opcode for SNB GS
The gfx6_gs_visitor overrides emit_urb_write_opcode but with a different
function signature.  This causes warnings with -Woverloaded-virtual.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12308>
2021-08-11 23:57:52 +00:00
Dave Airlie
8a81d14271 intel/vec4: sel.cond writes the flags on Gfx4 and Gfx5
This is the equivalent of idr's
intel/fs: sel.cond writes the flags on Gfx4 and Gfx5

except for the vec4 backend.

This fixes buggy rendering seen with crocus on a qt trace.

v2 (idr): Trivial whitespace change.  Add unit tests.

v3: Fix type in comment in unit tests.  Noticed by Jason and Priit.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Iron Lake
total instructions in shared programs: 8183077 -> 8184543 (0.02%)
instructions in affected programs: 198990 -> 200456 (0.74%)
helped: 0
HURT: 1355
HURT stats (abs)   min: 1 max: 8 x̄: 1.08 x̃: 1
HURT stats (rel)   min: 0.29% max: 6.00% x̄: 0.99% x̃: 0.70%
95% mean confidence interval for instructions value: 1.04 1.12
95% mean confidence interval for instructions %-change: 0.96% 1.03%
Instructions are HURT.

total cycles in shared programs: 238967672 -> 238962784 (<.01%)
cycles in affected programs: 4666014 -> 4661126 (-0.10%)
helped: 406
HURT: 314
helped stats (abs) min: 4 max: 54 x̄: 22.46 x̃: 18
helped stats (rel) min: <.01% max: 12.80% x̄: 1.82% x̃: 0.65%
HURT stats (abs)   min: 2 max: 112 x̄: 13.48 x̃: 12
HURT stats (rel)   min: <.01% max: 7.82% x̄: 0.81% x̃: 0.16%
95% mean confidence interval for cycles value: -8.60 -4.98
95% mean confidence interval for cycles %-change: -0.87% -0.49%
Cycles are helped.

GM45
total instructions in shared programs: 4986888 -> 4988354 (0.03%)
instructions in affected programs: 198990 -> 200456 (0.74%)
helped: 0
HURT: 1355
HURT stats (abs)   min: 1 max: 8 x̄: 1.08 x̃: 1
HURT stats (rel)   min: 0.29% max: 6.00% x̄: 0.99% x̃: 0.70%
95% mean confidence interval for instructions value: 1.04 1.12
95% mean confidence interval for instructions %-change: 0.96% 1.03%
Instructions are HURT.

total cycles in shared programs: 153577826 -> 153572938 (<.01%)
cycles in affected programs: 4666014 -> 4661126 (-0.10%)
helped: 406
HURT: 314
helped stats (abs) min: 4 max: 54 x̄: 22.46 x̃: 18
helped stats (rel) min: <.01% max: 12.80% x̄: 1.82% x̃: 0.65%
HURT stats (abs)   min: 2 max: 112 x̄: 13.48 x̃: 12
HURT stats (rel)   min: <.01% max: 7.82% x̄: 0.81% x̃: 0.16%
95% mean confidence interval for cycles value: -8.60 -4.98
95% mean confidence interval for cycles %-change: -0.87% -0.49%
Cycles are helped.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12191>
2021-08-11 13:09:32 -07:00
Ian Romanick
38807ceeae intel/fs: sel.cond writes the flags on Gfx4 and Gfx5
On Gfx4 and Gfx5, sel.l (for min) and sel.ge (for max) are implemented
using a separte cmpn and sel instruction.  This lowering occurs in
fs_vistor::lower_minmax which is called very, very late... a long, long
time after the first calls to opt_cmod_propagation.  As a result,
conditional modifiers can be incorrectly propagated across sel.cond on
those platforms.

No tests were affected by this change, and I find that quite shocking.
After just changing flags_written(), all of the atan tests started
failing on ILK.  That required the change in cmod_propagatin (and the
addition of the prop_across_into_sel_gfx5 unit test).

Shader-db results for ILK and GM45 are below.  I looked at a couple
before and after shaders... and every case that I looked at had
experienced incorrect cmod propagation.  This affected a LOT of apps!
Euro Truck Simulator 2, The Talos Principle, Serious Sam 3, Sanctum 2,
Gang Beasts, and on and on... :(

I discovered this bug while working on a couple new optimization
passes.  One of the passes attempts to remove condition modifiers that
are never used.  The pass made no progress except on ILK and GM45.
After investigating a couple of the affected shaders, I noticed that
the code in those shaders looked wrong... investigation led to this
cause.

v2: Trivial changes in the unit tests.

v3: Fix type in comment in unit tests.  Noticed by Jason and Priit.

v4: Tweak handling of BRW_OPCODE_SEL special case.  Suggested by Jason.

Fixes: df1aec763e ("i965/fs: Define methods to calculate the flag subset read or written by an fs_inst.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Dave Airlie <airlied@redhat.com>

Iron Lake
total instructions in shared programs: 8180493 -> 8181781 (0.02%)
instructions in affected programs: 541796 -> 543084 (0.24%)
helped: 28
HURT: 1158
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.35% max: 0.86% x̄: 0.53% x̃: 0.50%
HURT stats (abs)   min: 1 max: 3 x̄: 1.14 x̃: 1
HURT stats (rel)   min: 0.12% max: 4.00% x̄: 0.37% x̃: 0.23%
95% mean confidence interval for instructions value: 1.06 1.11
95% mean confidence interval for instructions %-change: 0.31% 0.38%
Instructions are HURT.

total cycles in shared programs: 239420470 -> 239421690 (<.01%)
cycles in affected programs: 2925992 -> 2927212 (0.04%)
helped: 49
HURT: 157
helped stats (abs) min: 2 max: 284 x̄: 62.69 x̃: 70
helped stats (rel) min: 0.04% max: 6.20% x̄: 1.68% x̃: 1.96%
HURT stats (abs)   min: 2 max: 48 x̄: 27.34 x̃: 24
HURT stats (rel)   min: 0.02% max: 2.91% x̄: 0.31% x̃: 0.20%
95% mean confidence interval for cycles value: -0.80 12.64
95% mean confidence interval for cycles %-change: -0.31% <.01%
Inconclusive result (value mean confidence interval includes 0).

GM45
total instructions in shared programs: 4985517 -> 4986207 (0.01%)
instructions in affected programs: 306935 -> 307625 (0.22%)
helped: 14
HURT: 625
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.35% max: 0.82% x̄: 0.52% x̃: 0.49%
HURT stats (abs)   min: 1 max: 3 x̄: 1.13 x̃: 1
HURT stats (rel)   min: 0.12% max: 3.90% x̄: 0.34% x̃: 0.22%
95% mean confidence interval for instructions value: 1.04 1.12
95% mean confidence interval for instructions %-change: 0.29% 0.36%
Instructions are HURT.

total cycles in shared programs: 153827268 -> 153828052 (<.01%)
cycles in affected programs: 1669290 -> 1670074 (0.05%)
helped: 24
HURT: 84
helped stats (abs) min: 2 max: 232 x̄: 64.33 x̃: 67
helped stats (rel) min: 0.04% max: 4.62% x̄: 1.60% x̃: 1.94%
HURT stats (abs)   min: 2 max: 48 x̄: 27.71 x̃: 24
HURT stats (rel)   min: 0.02% max: 2.66% x̄: 0.34% x̃: 0.14%
95% mean confidence interval for cycles value: -1.94 16.46
95% mean confidence interval for cycles %-change: -0.29% 0.11%
Inconclusive result (value mean confidence interval includes 0).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12191>
2021-08-11 13:09:20 -07:00
Marcin Ślusarz
7810ca596c intel/compiler: use nir_shader_instructions_pass in brw_nir_apply_attribute_workarounds
Changes:
- removal of attr_wa_state (it's passed directly)
- nir_metadata_preserve(..., nir_metadata_all) is called when pass doesn't
  make progress

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12324>
2021-08-11 11:23:30 +00:00
Jason Ekstrand
11ac7d9e02 intel/eu: Set scope to TILE for TGM flushes
Setting it to GPU can cause an L3$ flush in certain cases.  That's not
what we want as we really only care about coherency within the GPU.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Sagar Ghuge <sagar@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12291>
2021-08-10 14:00:19 +00:00
Lionel Landwerlin
97be8e42e4 intel/disasm: fix missing oword index decoding
Also switch to array of strings to show high/low dwords.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: daba2894ff ("intel/disasm: decode/describe more send messages")
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12183>
2021-08-03 17:26:37 +00:00
Timothy Arceri
a9ed4538ab nir: add indirect loop unrolling to compiler options
This is where it should be rather than having to pass it into the
optimisation pass every time.

It also allows us to call the loop analysis pass without having to
duplicate these options which we will do later in this series.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12064>
2021-08-03 10:54:50 +00:00
Timothy Arceri
a654e39f15 intel/compiler: make sure swizzle is applied to if condition
This fixes a hang in the following piglit test when GCM moves a
UBO load outside of the loop.

tests/shaders/ssa/fs-if-def-else-break.shader_test

The end NIR ends up looking like this:

	vec2 32 ssa_3 = intrinsic load_ubo (ssa_2, ssa_0) (0, 1073741824, 0, 0, 8)
	vec1 32 ssa_4 = mov ssa_3.x
	vec1 32 ssa_5 = inot ssa_3.y
	/* succs: block_1 */
	loop {
           ...
           if ssa_5 { }
        }

Fixes: 1edf67fc3f ("intel/fs: Generate if instructions with inverted conditions")

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12064>
2021-08-03 10:54:50 +00:00
Ian Romanick
5ffbee84a4 intel/compiler: Add id parameter to shader_perf_log callback
There are two problems with the current architecture.

In OpenGL, the id is supposed to be a unique identifier for a particular
log source.  This is done so that applications can (theoretically)
filter particular log messages.  The debug callback infrastructure in
Mesa assigns a uniqe value when a value of 0 is passed in.  This causes
the id to get set once to a unique value for each message.

By passing a stack variable that is initialized to 0 on every call,
every time the same message is logged, it will have a different id.
This isn't great, but it's also not catastrophic.

When threaded shader compiles are used, the id *pointer* is saved and
dereferenced at a possibly much later time on a possibly different
thread.  This causes one thread to access the stack from a different
thread... and that stack frame might not be valid any more. :(

I have not observed any crashes related to this particular issue.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12136>
2021-08-01 23:58:08 +00:00
Ian Romanick
043c5bf966 intel/compiler: Add id parameter to shader_debug_log callback
There are two problems with the current architecture.

In OpenGL, the id is supposed to be a unique identifier for a particular
log source.  This is done so that applications can (theoretically)
filter particular log messages.  The debug callback infrastructure in
Mesa assigns a uniqe value when a value of 0 is passed in.  This causes
the id to get set once to a unique value for each message.

By passing a stack variable that is initialized to 0 on every call,
every time the same message is logged, it will have a different id.
This isn't great, but it's also not catastrophic.

When threaded shader compiles are used, the id *pointer* is saved and
dereferenced at a possibly much later time on a possibly different
thread.  This causes one thread to access the stack from a different
thread... and that stack frame might not be valid any more. :(

This fixes shader-db crashes of various kinds on Iris with threaded
shader compiles enabled.

Fixes: 42c34e1ac8 ("iris: Enable threaded shader compilation")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12136>
2021-08-01 23:58:08 +00:00
Dave Airlie
c8783001c7 intel/fs: restrict max push length on older GPUs to a smaller amount
Fixes crash in dEQP-GLES2.functional.uniform_api.random.79

Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12093>
2021-07-30 15:17:21 +10:00
Jason Ekstrand
74ec2b12be nir/lower_tex: Rework invalid implicit LOD lowering
Only fragment and some compute shaders support implicit derivatives.
They're totally meaningless without helper invocations and some
understanding of the dispatch pattern.  We've got code to lower
nir_texop_tex in these shader stages to use an explicit derivative of 0
but it was pretty badly broken:

 1. It only handled nir_texop_tex, not nir_texop_txb or nir_texop_lod.

 2. It didn't take min_lod into account

 3. It was conflated with adding a missing LOD parameter to opcodes
    which expect one such as nir_texop_txf.  While not really a bug,
    this does make it way harder to reason about the code.

 4. Unless you set a flag (which most drivers don't), it left the
    opcode nir_texop_tex instead of nir_texop_txl which it should have
    been.

This reworks it to go through roughly the same path as other LOD
lowering only with a constant lod of 0 instead of calling out to
nir_texop_lod.  We also get rid of the lower_tex_without_implicit_lod
flag because most drivers set it and those that don't are probably
subtly broken.  If someone really wants to get nir_texop_tex in their
vertex shaders, they can write a new patch to add the flag back in.

Fixes: e382890e25 "nir: set default lod to texture opcodes that..."
Fixes: d5ac5d6e83 "nir: Add option to lower tex to txl when..."
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11775>
2021-07-23 15:53:57 +00:00
Jason Ekstrand
4465ca296d nir: Suffix all the MCS texture stuff _intel
It's intel-specific, used to get at MSAA compression information.

Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11775>
2021-07-23 15:53:57 +00:00
Sagar Ghuge
ef29bb6bc5 intel/compiler: Handle ternary add in lower_simd_width
We need to lower the add3 instruction simd width otherwise in simd32
mode, we endup writing 4 register wide data which is not allowed.

Reported-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11985>
2021-07-22 23:38:04 +00:00
Sagar Ghuge
0608e76e00 intel/compiler: Fix missing break in switch
CoverityID: 1487496

Fixes: cde9ca616d "intel/compiler: Make decision based on source type instead of opcode"
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11985>
2021-07-22 23:38:04 +00:00
Jason Ekstrand
929558776e intel/eu: Don't validate LSC transpose on ops that don't have it
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11991>
2021-07-22 21:06:33 +00:00
Jordan Justen
8c29891fa4 intel/compiler: Remove cube array size lowering in compiler backend
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9466>
2021-07-21 11:34:49 -07:00
Jordan Justen
e2a30ebf44 intel/compiler: Lower cube image sizes using nir_lower_image()
Reworks:
 * Re-merge early/late passes using Jason's nir image deref patches
 * Create and use a common nir_lower_image() pass. (s-b Jason)
 * Remove cube array size handling in image load/store lowering

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9466>
2021-07-21 11:34:49 -07:00
Jordan Justen
b5514a2236 intel/compiler: Rename brw_nir_lower_image_load_store to brw_nir_lower_storage_image
Reworks:
 * Add crocus

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9466>
2021-07-21 11:02:15 -07:00
Timothy Arceri
9f3fde6f36 intel/compiler: Use GCM in nir_optimize
There is still some work to do before we can enable GVN.

In these shader-db results, Skylake and older platforms used i965 while
newer platforms used Iris.  I believe this accounts for the difference
in "sends."  The shaders helped for sends are all Synmark shaders.

On Sandybridge, the shaders helped for sends were the same ones hurt for
spills and fills.  These are also all Synmark shaders.

Tiger Lake and Ice Lake had similar results. (Ice Lake shown)
total instructions in shared programs: 19868594 -> 19868452 (<.01%)
instructions in affected programs: 6607 -> 6465 (-2.15%)
helped: 12
HURT: 1
helped stats (abs) min: 12 max: 12 x̄: 12.00 x̃: 12
helped stats (rel) min: 1.94% max: 2.62% x̄: 2.38% x̃: 2.58%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.45% max: 0.45% x̄: 0.45% x̃: 0.45%
95% mean confidence interval for instructions value: -13.27 -8.58
95% mean confidence interval for instructions %-change: -2.67% -1.65%
Instructions are helped.

total cycles in shared programs: 962404540 -> 962008224 (-0.04%)
cycles in affected programs: 961274 -> 564958 (-41.23%)
helped: 23
HURT: 1
helped stats (abs) min: 10 max: 32536 x̄: 17438.96 x̃: 23658
helped stats (rel) min: 0.02% max: 80.04% x̄: 42.05% x̃: 51.58%
HURT stats (abs)   min: 4780 max: 4780 x̄: 4780.00 x̃: 4780
HURT stats (rel)   min: 3.26% max: 3.26% x̄: 3.26% x̃: 3.26%
95% mean confidence interval for cycles value: -22989.90 -10036.43
95% mean confidence interval for cycles %-change: -55.01% -25.32%
Cycles are helped.

Skylake and Broadwell had simliar results. (Skylake shown)
total instructions in shared programs: 17996652 -> 17996154 (<.01%)
instructions in affected programs: 96622 -> 96124 (-0.52%)
helped: 85
HURT: 0
helped stats (abs) min: 1 max: 12 x̄: 5.86 x̃: 5
helped stats (rel) min: 0.39% max: 2.65% x̄: 0.68% x̃: 0.39%
95% mean confidence interval for instructions value: -6.42 -5.30
95% mean confidence interval for instructions %-change: -0.84% -0.52%
Instructions are helped.

total cycles in shared programs: 939899189 -> 939289732 (-0.06%)
cycles in affected programs: 3719430 -> 3109973 (-16.39%)
helped: 60
HURT: 39
helped stats (abs) min: 18 max: 32444 x̄: 10437.30 x̃: 6940
helped stats (rel) min: 0.08% max: 80.40% x̄: 23.99% x̃: 12.07%
HURT stats (abs)   min: 10 max: 4970 x̄: 430.28 x̃: 323
HURT stats (rel)   min: 0.05% max: 3.41% x̄: 1.55% x̃: 1.60%
95% mean confidence interval for cycles value: -8095.51 -4216.75
95% mean confidence interval for cycles %-change: -18.65% -9.21%
Cycles are helped.

total sends in shared programs: 1026997 -> 1026927 (<.01%)
sends in affected programs: 6090 -> 6020 (-1.15%)
helped: 70
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.15% max: 1.15% x̄: 1.15% x̃: 1.15%
95% mean confidence interval for sends value: -1.00 -1.00
95% mean confidence interval for sends %-change: -1.15% -1.15%
Sends are helped.

Haswell and Ivy Bridge had similar results. (Haswell shown)
total instructions in shared programs: 16040891 -> 16040252 (<.01%)
instructions in affected programs: 109132 -> 108493 (-0.59%)
helped: 87
HURT: 0
helped stats (abs) min: 1 max: 12 x̄: 7.34 x̃: 7
helped stats (rel) min: 0.05% max: 2.61% x̄: 0.75% x̃: 0.51%
95% mean confidence interval for instructions value: -7.84 -6.85
95% mean confidence interval for instructions %-change: -0.90% -0.61%
Instructions are helped.

total cycles in shared programs: 968579567 -> 967867117 (-0.07%)
cycles in affected programs: 30688439 -> 29975989 (-2.32%)
helped: 241
HURT: 62
helped stats (abs) min: 4 max: 31929 x̄: 3901.22 x̃: 2282
helped stats (rel) min: 0.04% max: 79.63% x̄: 12.70% x̃: 4.44%
HURT stats (abs)   min: 4 max: 8230 x̄: 3673.27 x̃: 637
HURT stats (rel)   min: 0.01% max: 63.87% x̄: 24.54% x̃: 3.56%
95% mean confidence interval for cycles value: -3100.23 -1602.41
95% mean confidence interval for cycles %-change: -7.94% -2.22%
Cycles are helped.

total sends in shared programs: 935025 -> 934955 (<.01%)
sends in affected programs: 6090 -> 6020 (-1.15%)
helped: 70
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.15% max: 1.15% x̄: 1.15% x̃: 1.15%
95% mean confidence interval for sends value: -1.00 -1.00
95% mean confidence interval for sends %-change: -1.15% -1.15%
Sends are helped.

LOST:   1
GAINED: 0

Sandy Bridge
total instructions in shared programs: 11785330 -> 11786504 (<.01%)
instructions in affected programs: 53462 -> 54636 (2.20%)
helped: 16
HURT: 36
helped stats (abs) min: 1 max: 17 x̄: 10.06 x̃: 9
helped stats (rel) min: 0.47% max: 3.29% x̄: 2.03% x̃: 1.90%
HURT stats (abs)   min: 5 max: 38 x̄: 37.08 x̃: 38
HURT stats (rel)   min: 1.77% max: 2.98% x̄: 2.94% x̃: 2.98%
95% mean confidence interval for instructions value: 16.26 28.90
95% mean confidence interval for instructions %-change: 0.75% 2.08%
Instructions are HURT.

total cycles in shared programs: 498009911 -> 497378300 (-0.13%)
cycles in affected programs: 6848277 -> 6216666 (-9.22%)
helped: 108
HURT: 28
helped stats (abs) min: 4 max: 25394 x̄: 6037.42 x̃: 766
helped stats (rel) min: 0.02% max: 60.58% x̄: 11.60% x̃: 4.83%
HURT stats (abs)   min: 96 max: 6834 x̄: 729.64 x̃: 742
HURT stats (rel)   min: 0.17% max: 16.23% x̄: 1.57% x̃: 1.55%
95% mean confidence interval for cycles value: -5907.99 -3380.40
95% mean confidence interval for cycles %-change: -11.67% -6.11%
Cycles are helped.

total spills in shared programs: 2316 -> 2526 (9.07%)
spills in affected programs: 280 -> 490 (75.00%)
helped: 0
HURT: 35

total fills in shared programs: 1540 -> 1750 (13.64%)
fills in affected programs: 280 -> 490 (75.00%)
helped: 0
HURT: 35

total sends in shared programs: 642985 -> 642950 (<.01%)
sends in affected programs: 3045 -> 3010 (-1.15%)
helped: 35
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.15% max: 1.15% x̄: 1.15% x̃: 1.15%
95% mean confidence interval for sends value: -1.00 -1.00
95% mean confidence interval for sends %-change: -1.15% -1.15%
Sends are helped.

LOST:   1
GAINED: 0

Iron Lake and GM45 had similar results. (Iron Lake shown)
total cycles in shared programs: 239442382 -> 239429400 (<.01%)
cycles in affected programs: 20816 -> 7834 (-62.37%)
helped: 2
HURT: 0

In Fossil-db, all of the shaders hurt for spill and fills are compute
shaders from Shadow of the Tomb Raider.  Two shaders were helped for
sends, and these are also from SotTR.

All of the shaders helped for loops were from Geekbench5.  These all
went from 3 loops to 2.

Tiger Lake
Instructions in all programs: 160852396 -> 160855303 (+0.0%)
SENDs in all programs: 6878559 -> 6878559 (+0.0%)
Loops in all programs: 38350 -> 38305 (-0.1%)
Cycles in all programs: 7369162339 -> 7344236445 (-0.3%)
Spills in all programs: 193762 -> 193876 (+0.1%)
Fills in all programs: 306417 -> 306600 (+0.1%)

Ice Lake
Instructions in all programs: 144592523 -> 144593946 (+0.0%)
SENDs in all programs: 6930697 -> 6930697 (+0.0%)
Loops in all programs: 38344 -> 38299 (-0.1%)
Cycles in all programs: 8732456458 -> 8707823383 (-0.3%)
Spills in all programs: 216692 -> 216806 (+0.1%)
Fills in all programs: 334089 -> 334272 (+0.1%)

Skylake
Instructions in all programs: 135618746 -> 135619971 (+0.0%)
SENDs in all programs: 6896728 -> 6896724 (-0.0%)
Loops in all programs: 38343 -> 38298 (-0.1%)
Cycles in all programs: 8391957144 -> 8368935657 (-0.3%)
Spills in all programs: 194741 -> 194879 (+0.1%)
Fills in all programs: 301048 -> 301255 (+0.1%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/597>
2021-07-21 14:24:00 +00:00
Timothy Arceri
c742a99fb6 intel/compiler: call nir_opt_dead_cf() after we have finished all opts
This will avoid a regression with the following patch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/597>
2021-07-21 14:24:00 +00:00
Jason Ekstrand
a62973580b intel/eu: Start validating LSC message descriptors
This is certainly not a full validation but it at least gets the
framework in place and validates one hard-to-find restriction.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11657>
2021-07-16 17:25:48 +00:00
Sagar Ghuge
89b7a40212 intel/compiler: Enable has_iadd3 option on XeHP
shader-db result is inconclusive but doesn't harm to include it for
reference.

Shader-db result on XeHPG:

total instructions in shared programs: 1397405 -> 1397315 (<.01%)
instructions in affected programs: 88252 -> 88162 (-0.10%)
helped: 20
HURT: 7
helped stats (abs) min: 1 max: 18 x̄: 7.20 x̃: 7
helped stats (rel) min: 0.03% max: 2.20% x̄: 0.37% x̃: 0.23%
HURT stats (abs)   min: 4 max: 23 x̄: 7.71 x̃: 4
HURT stats (rel)   min: 0.10% max: 0.68% x̄: 0.22% x̃: 0.11%
95% mean confidence interval for instructions value: -6.81 0.14
95% mean confidence interval for instructions %-change: -0.42% -0.02%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 119924219 -> 119931868 (<.01%)
cycles in affected programs: 45029193 -> 45036842 (0.02%)
helped: 11
HURT: 16
helped stats (abs) min: 15 max: 5490 x̄: 1655.73 x̃: 140
helped stats (rel) min: <.01% max: 0.35% x̄: 0.11% x̃: <.01%
HURT stats (abs)   min: 1 max: 2944 x̄: 1616.38 x̃: 1743
HURT stats (rel)   min: <.01% max: 0.17% x̄: 0.09% x̃: 0.10%
95% mean confidence interval for cycles value: -606.11 1172.70
95% mean confidence interval for cycles %-change: -0.04% 0.07%
Inconclusive result (value mean confidence interval includes 0).

v2:
- Include shader-db result (Jason)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11596>
2021-07-16 15:59:56 +00:00
Sagar Ghuge
e6db2299a8 intel/compiler: Allow ternary add to promote source to immediate
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11596>
2021-07-16 15:59:56 +00:00
Sagar Ghuge
cde9ca616d intel/compiler: Make decision based on source type instead of opcode
This patch restructure code a little bit to check if source can be
represented as immediate operand. This is a foundation for next patch
which add checks for integer operand as well.

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11596>
2021-07-16 15:59:56 +00:00
Sagar Ghuge
705285b9f4 intel/compiler: Add support for ternary add instruction on XeHP
v2:
- Re-arragne opcode in correct order (Matt Turner)
- Move ADD3 case closer to LRP (Jason)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11596>
2021-07-16 15:59:56 +00:00
Jason Ekstrand
6642749458 intel/dev: Add a max_cs_workgroup_threads field
This is distinct form max_cs_threads because it also encodes
restrictions about the way we use GPGPU/COMPUTE_WALKER.  This gets rid
of the MIN2(64, devinfo->max_cs_threads) we have scattered all over the
driver and puts it in a central place.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11861>
2021-07-14 23:02:34 +00:00
Ian Romanick
3cb203303c intel/compiler: Update block IPs once in opt_cmod_propagation
No difference proven at 95.0% confidence (n=10) in
dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13.

v2: Only update each block's IP data once instead of once per block.
Suggested by Emma.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11632>
2021-07-14 09:57:06 -07:00
Ian Romanick
8f1052938d intel/compiler: Update block IPs once in register_coalesce
Performance improvement in
dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13 for n=30:

release build (w/Fedora build flags): -0.82% ± 0.23%
Meson -Dbuildtype=debugoptimized:     -0.74% ± 0.27%

The difference in the debugoptimized build is the calls to
inst_is_in_block(block, this) still exist on each call to remove().

v2: Only update each block's IP data once instead of once per block.
Suggested by Emma.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11632>
2021-07-14 09:57:04 -07:00
Ian Romanick
f3f3817307 intel/compiler: Update block IPs once in dead_code_eliminate
Performance improvement in
dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13 for n=30:

release build (w/Fedora build flags): -7.79% ± 0.25%
Meson -Dbuildtype=debugoptimized:     -5.10% ± 0.40%

The difference in the debugoptimized build is the calls to
inst_is_in_block(block, this) still exist on each call to remove().

v2: Only update each block's IP data once instead of once per block.
Suggested by Emma.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11632>
2021-07-14 09:57:01 -07:00
Ian Romanick
8ca1bc5f94 intel/compiler: Add cfg_t::adjust_block_ips() method
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11632>
2021-07-14 09:56:59 -07:00
Ian Romanick
8206b04d43 intel/compiler: Add the ability to defer IP updates in backend_instruction::remove
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11632>
2021-07-14 09:56:46 -07:00
Jason Ekstrand
3d934ee03f glsl: Delete lower_texture_projection
This is only used by i965 and we've been getting it through
nir_lower_tex since forever.  Get rid of the GLSL IR pass.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11827>
2021-07-13 14:06:33 +00:00
Marcin Ślusarz
f3742b9c13 intel/compiler: document register types
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11677>
2021-07-12 13:27:41 +00:00
Lionel Landwerlin
91dcbf1f56 intel/compiler: Track latency/perf of LSC fences
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11759>
2021-07-12 11:39:03 +00:00
Andrii Simiklit
57f54bb9cc Remove redundant assignment
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4957
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11780>
2021-07-09 09:34:27 +00:00
Jason Ekstrand
624e799cc3 nir: Drop nir_ssa_def::name and nir_register::name
We say that they're for debug only but we don't really have a good
policy around when to set them and when not to.  In particular,
nir_lower_system_values and nir_lower_vars_to_ssa which are the chief
producers of SSA values which might reasonably have a name do not bother
to set one.  We have some names set from things like BLORP and RADV's
meta shaders but AFAICT, they're setting a name more because it's there
than because they actually care.

Also, most things other than nir_clone and nir_serialize don't bother to
try and preserve them.  You can see in the diffstat of this commit
exactly what passes attempt to preserve names.  Notably missing from the
list is opt_algebraic which is the single largest source of SSA def
churn and it happily throws names away.

These observations lead me to question whether or not names are actually
useful at all or if they're just taking up space (8B per instruction)
and wasting CPU cycles (to ralloc_strdup on the off chance we do have
one).  I don't think I can think of a single time in recent history
where I've been debugging a shader issue and a SSA value name has been
there and been useful.  If anything, the few times they are there, they
just throw me off because they mess up the indentation in nir_print.

iris shader-db on my system gets runtime -2.07734% +/- 1.26933% (n=5)

Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5439>
2021-07-08 17:34:41 +00:00
Connor Abbott
e4e79de2a4 nir/subgroups: Support > 1 ballot components
Qualcomm has a mode with a subgroup size of 128, so just emitting larger
integer operations and then lowering them later isn't an option. This
makes the pass able to handle the lowering itself, so that we don't have
to go down to 64-thread wavefronts when ballots are used.

(The GLSL and legacy SPIR-V extensions only support a maximum of 64
threads, but I guess we'll cross that bridge when we come to it...)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
2021-07-08 16:02:41 +00:00
Yevhenii Kolesnikov
974c58b317 intel: fix leaking memory on shader creation
ralloc_adopt takes care of all the shader's children, but shader itsel ends up
orphaned and never gets free'd.

Fixes: ef5bce9253 ("intel: Drop the last uses of a mem_ctx in nir_builder_init_simple_shader().")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4951

Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11651>
2021-06-30 19:34:56 +03:00
Jason Ekstrand
f5876dfdb9 intel/fs: Lower uniform pull constant load message to LSC dataport
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>
2021-06-30 16:17:18 +00:00
Sagar Ghuge
6362059b6b intel/fs: Lower varying pull constant load message to LSC dataport
Co-authored-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>
2021-06-30 16:17:18 +00:00
Sagar Ghuge
4fca64ad4d intel/fs: Lower A64 atomic messages to LSC dataport
Co-authored-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>
2021-06-30 16:17:18 +00:00
Sagar Ghuge
07a4bdf1e8 intel/fs: Lower A64 byte scattered r/w messages to LSC dataport
Co-authored-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>
2021-06-30 16:17:18 +00:00
Mark Janes
22d20dbb02 intel/fs: Lower A64 untyped r/w messages to LSC when available
We set the ex_desc to 0, since the address surface type is FLAT.

v2 (Sagar Ghuge):
 - Fix message descriptor encoding

v2 (Jason Ekstrand):
 - Drop support for block messages

Co-authored-by: Sagar Ghuge <sagar.ghuge@intel.com>
Co-authored-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>
2021-06-30 16:17:18 +00:00