Commit graph

151646 commits

Author SHA1 Message Date
Dave Airlie
90a6947632 u_transfer: refactor out code to check interleave/deinterleave path.
The checks were reproduced making adding another one not so fun.

rework the deinterleave path code to match the interleave path code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15516>
2022-03-26 01:22:15 +00:00
Dave Airlie
783cab811d util/format: add new z24/s8 packing helper to pack z32/s8.
If zink runs on top of a vulkan impl with no 24-bit float support
it needs support to pack into 24-bit for GL.

To avoid having to make a temp copy, add a new helper to convert
and pack.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15516>
2022-03-26 01:22:15 +00:00
Kenneth Graunke
823745dc27 intel/compiler: Use nir_opt_uniform_atomics()
In general, an atomic intrinsic may perform separate atomics for every
enabled SIMD channel, as each channel may operate on different memory.

However, an extremely common case is for all channels to access the same
memory location.  In this case, we can simply perform a reduction/scan
across the subgroup, and perform one atomic for the whole subgroup,
rather than one per channel.  For example, if an intrinsic says to take
the minimum value of the existing memory and the value in each channel,
we can do a thread-local minimum of all enabled channels, then do a
single atomic to take the minimum of that and the existing memory.

Our hardware doesn't optimize the case where multiple channels ask for
atomics on the same memory location; it assumes the compiler will do so.

nir_opt_uniform_atomics() uses divergence analysis to detect this case,
adds the necessary subgroup operations, and moves the atomic inside a
conditional that disables all but a single invocation.  It even detects
cases where the shader code already performs this kind of optimization,
and avoids doing it a second time.

This may not be the optimal solution for us.  In the backend, we could
detect this case and emit send(1) instructions with NoMask, rather than
generating if...send(16)...endif, and a lot of unnecessary ALU ops.  But
it's simple to do, reuses the same path as ACO, and still provides most
of the benefit by cutting up to 16x atomics down to a single atomic,
which is more merciful to the memory bus.

Improves performance of Shadow of the Tomb Raider by 5.5% on XeHP.

Improves performance of a customer-internal benchmark on XeHP at
3840x2160 and low settings by approximately 30%.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15484>
2022-03-26 00:28:19 +00:00
Kenneth Graunke
49ef23f4a6 intel/compiler: Convert to LCSSA and use divergence analysis.
We'll use this more shortly.  For now, enable it to separately in case
anything bisects to this.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15484>
2022-03-26 00:28:19 +00:00
Kenneth Graunke
b3942beecf intel/compiler: Set divergence analysis options
Although we don't use divergence analysis yet, we've had several
work-in-progress series that make use of it.  We may as well set
our options so that those series can assume they're in place.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15484>
2022-03-26 00:28:19 +00:00
Kenneth Graunke
6fa66ac228 intel/compiler: Implement nir_intrinsic_last_invocation
We haven't exposed this intrinsic as it doesn't directly correspond to
anything in SPIR-V.  However, it's used internally by some NIR passes,
namely nir_opt_uniform_atomics().

We reuse most of the infrastructure in brw_find_live_channel, but with
LZD/ADD instead of FBL.  A new SHADER_OPCODE_FIND_LAST_LIVE_CHANNEL is
like SHADER_OPCODE_FIND_LIVE_CHANNEL but from the other side.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15484>
2022-03-26 00:28:19 +00:00
Kenneth Graunke
af529b545a nir: Teach nir_divergence_analysis about Intel-specific intrinsics
- load_reloc_const is just an immediate constant load, it's convergent.

- nir_intrinsic_load_global_const_block_intel should be convergent,
  it says the address must be uniform, and we uniformize the predicate

- Lowered image intrinsics: image_deref_load_param_intel just reads
  information about an image, as long as the image variable is
  convergent it should be too.  load_raw_intel...if the address we
  come up with is convergent, it ought to be as well.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15484>
2022-03-26 00:28:19 +00:00
Mike Blumenkrantz
3e9bd67f23 zink: add another radv flake
literally no idea

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15589>
2022-03-26 00:06:47 +00:00
Caio Oliveira
c32d386ce2 intel/compiler: Inline TUE map computation into TUE Input lowering
Refactor since the TUE compute function is simpler now and the
comments make sense being near the lowering.

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15022>
2022-03-25 23:29:19 +00:00
Caio Oliveira
c36ae42e4c intel/compiler: Use nir_var_mem_task_payload
Instead of reusing the in/out slot mechanism, use a separated NIR
variable mode.  This will make easier later to implement staging the
output in shared memory (and storing all at the end to the URB).

Note to get 64-bit type support we currently rely on the
brw_nir_lower_mem_access_bit_sizes() pass.

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15022>
2022-03-25 23:29:19 +00:00
Daniel Schürmann
2d1e6b756e aco: remove 'high' parameter from can_use_opsel()
No fossil-db changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15551>
2022-03-25 22:02:50 +00:00
Daniel Schürmann
b98a9dcc36 aco/optimizer: fix call to can_use_opsel() in apply_insert()
The definition index is -1.

Fixes: 54292e99c7 ('aco: optimize 32-bit extracts and inserts using SDWA ')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15551>
2022-03-25 22:02:50 +00:00
Adam Jackson
8006179cfd wsi/x11: xcb_wait_for_special_event failure is an error
The only ways that function can return NULL are:

- the xcb connection was closed
- the window for the swapchain was destroyed
- the special event listener was unregistered from another thread
- malloc failure

All of these are permanent errors, the swapchain is no longer in a
usable state, so we should treat this as VK_ERROR_SURFACE_LOST_KHR.

Acked-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15558>
2022-03-25 19:31:13 +00:00
Alyssa Rosenzweig
f31208f778 pan/va: Lower BLEND to call blend shaders
Do this as late as possible.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
cb76cc1f1d pan/va: Add packing unit tests
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
18bf478f1e pan/va: Add shader-db support
Reports the common subset from Bifrost, as well as Mali offline compiler
style normalized cycle counts.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
8bc268f2d5 pan/va: Implement the cycle model
Will feed into shader-db reporting, and maybe other things eventually.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
8a258a685c pan/va: Test instruction selection lowerings
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
1745c89312 pan/va: Lower branch offsets
Logic is lifted from bi_layout.c, adapted to work on instructions (not
clauses) and for Valhall's off-by-one semantic which is annoyingly
different than Bifrost. (But the same as Midgard -- Bifrost was
annoyingly different than Midgard!)

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
9a9b20e652 pan/va: Add instruction selection lowering pass
Valhall removes certain instructions from Bifrost, requiring a canonical
lowering.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
b796d32564 pan/va: Add constant lowering pass
Valhall has a lookup table for common constants. Add a pass to take
advantage of it, lowering away immediate indices.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
b8f912e547 pan/va: Validate FAU before packing
These are pre-conditions required for packing.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
fd1906afea pan/va: Add FAU validation
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
676d9c9441 pan/va: Add unit tests for ADD_IMM optimizations
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
13d7ca1300 pan/va: Optimize add with imm to ADD_IMM
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
f45654af59 pan/va: Add packing routines
Mostly manual since Valhall is regular.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
edf284215d pan/va: Add helpers for swapping bitwise sources
Annoyingly different from Bifrost.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
619566dea1 pan/va: Generate header containing enums
We already collect enums in the ISA description XML. Export them for use in the
compiler backend, particularly the packing code.

Usually we'd use Mako for templating. In this case, the script is so trivial a
template engine didn't seem worth it. (The obvious version with Mako was about
10 lines longer than just prints and f-strings used here.)

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Suggested-by: Icecream95 <ixn@disroot.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
7ad98ae96e pan/va: Build opcode info structures
Filled out the new structures from XML.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
40ed485e32 pan/va: Permit encoding more flags
Missed the first time around.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
76487c7eb4 pan/va: Unify flow control
Group together dependency waits and flow control into a single enum. This
simplifies the code, clarifies some detail, and ensures consistency moving
forward.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
cf6d1a81f6 pan/va: Add Bifrost-style LD_VAR instructions
For use in the legacy non-MALLOC_IDVS flow. Especially useful in blit shaders.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
295b802f64 pan/va: Add LD_VAR_BUF instructions
Like LD_VAR_BUF_IMM but indirect.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
e8590e0d04 pan/va: Add ST_TILE instruction
Encoded like LD_TILE, required for some MSAA blend shaders.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
fa841273d4 pan/bi: Rename I->action to I->flow
For consistency with the Valhall ISA.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
f5585700be pan/bi: Model LD_VAR_BUF instructions
These are indirect versions of LD_VAR_BUF_IMM, taking their index in bytes. Used
for indirect varying loads (the NIR lowering is inefficient).

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
97a13d6424 pan/bi: Augment ST_TILE with register format
To model its Valhall incarnation.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
c7f6b973b2 pan/bi: Check return addresses in blend shaders
Required on Valhall, where jumping to 0x0 doesn't automatically terminate the
program. Luckily the check is free there too.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
1b7d7ebbab pan/bi: Allow branch_offset on BLEND
Required to model BLEND accurately on Valhall, where it encodes a special
relative branch... Midgard style!

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
cfde0275e4 pan/bi: Model Valhall-style A(CMP)XCHG
Handled consistently with computational atomics.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
90867e8204 pan/bi: Add ATOM_RETURN pseudo-instruction
Allows modeling Valhall's atomics better.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
7983a0d0dc pan/bi: Rename PATOM_C to ATOM
This is basically what's native on Valhall. Use the Valhall naming for the
pseudo-instruction on Bifrost for consistency.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
b70a7c97bb pan/bi: Gate late DCE/CSE on "optimize"
Otherwise we can end up with unlowered ATOM.i32 on Valhall.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Alyssa Rosenzweig
3485b8dc78 pan/bi: Use consistent modifier lists in packing
If there are modifiers only used by pseudo instructions, not the real
instructions, bi_packer can get out-of-sync with bi_opcodes, causing
hard-to-debug issues. Do the stupid-simple thing to ensure this doesn't happen.

This may be a temporary issue, depending whether ISA.xml and the IR get split
out for better Valhall support.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
2022-03-25 19:00:13 +00:00
Emma Anholt
b1fa2068b8 nouveau/nir: Enable nir_opt_move/sink.
NIR load_consts/inputs tend to happen together at the top of the program.
In the TGSI backend the loads got emitted at use time, while the NIR
backend was emitting the loads at load intrinsic time.  By sinking the
intrinsics, we can greatly reduce register pressure.

nv92 NIR results:

total local in shared programs: 2024 -> 2020 (-0.20%)
local in affected programs: 4 -> 0
total gpr in shared programs: 790424 -> 735455 (-6.95%)
gpr in affected programs: 215968 -> 160999 (-25.45%)
total instructions in shared programs: 6058339 -> 6051208 (-0.12%)
instructions in affected programs: 410795 -> 403664 (-1.74%)
total bytes in shared programs: 41820104 -> 41660304 (-0.38%)
bytes in affected programs: 7147296 -> 6987496 (-2.24%)

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15542>
2022-03-25 18:34:39 +00:00
Rajnesh Kanwal
df1c7ca0e5 pvr: Use vk_common_GetDeviceQueue API.
Removes pvr_GetDeviceQueue implementation. As we are now
using the common vk_queue structure as a base for pvr_queue,
we can use vk_common_GetDeviceQueue implementation instead.

Signed-off-by: Rajnesh Kanwal <rajnesh.kanwal@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15574>
2022-03-25 18:07:08 +00:00
Boris Brezillon
a3bdad314e dzn: Compile-test the driver
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14766>
2022-03-25 16:21:45 +00:00
Erik Faye-Lund
a012b21964 microsoft: Initial vulkan-on-12 driver
This is Dozen, the Vulkan on DirectX 12 driver. Not to be confused with
DirectEggs.

This is an early prototype, and not meant to be upstreamed as-is.

Co-Authored-by: Boris Brezillon <boris.brezillon@collabora.com>
Co-Authored-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Co-Authored-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Co-Authored-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14766>
2022-03-25 16:21:45 +00:00
Enrico Galli
6635d011cb microsoft/spirv_to_dxil: Add missing ralloc_free
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14766>
2022-03-25 16:21:45 +00:00
Boris Brezillon
7e0ab29cfd vulkan/util: Make STACK_ARRAY() work for arrays of pointers
And add an explicit cast on the pointer returned by malloc() to
make C++ happy.

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14766>
2022-03-25 16:21:45 +00:00