Commit graph

167720 commits

Author SHA1 Message Date
Ryan Neph
1d12d7c33c venus: update venus-protocol headers to fix WA1
Follow-up to previous commit, this time to fix encoding/decoding for
VkDrmFormatModifierPropertiesListEXT::drmFormatModifierCount. Fixes a
workaround (WA1) in the venus-protocol.

Signed-off-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21395>
2023-02-27 16:45:02 +00:00
Alyssa Rosenzweig
4eabd6586b nir/lower_blend: Don't dereference null
If a dual source blend colour is never written, src1 will be null and it will be
invalid to dereference it. src1 is dereferenced both for the f2fN instruction
but also if a dual blend factor is used... even if the latter isn't strictly
valid, segfaulting in the NIR pass seems a lot meaner than blending with zero.

The referenced commit hosed Asahi, causing anything that used blending to crash.
Panfrost is unaffected since it always supplies a dual colour due to our crude
construction of blend shaders.

Fixes: 8313016543 ("nir/lower_blend: Consume dual stores")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21544>
2023-02-27 15:47:33 +00:00
Rhys Perry
75d9a4a6ce aco: always update orig_names in get_reg_phi()
No idea why this wasn't done if pc.first was a renamed temporary.

Fixes navi10 RA validation error with
dEQP-VK.binding_model.descriptor_buffer.multiple.graphics_geom_buffers1_sets3_imm_samplers

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8349
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21501>
2023-02-27 15:10:22 +00:00
Eric Engestrom
735df516e9 radv: split linker script for android since it requires different symbols
Fixes: 4956f6d0bf ("radv: Add Android module info to linker script.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8338
Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21518>
2023-02-27 14:34:16 +00:00
Yonggang Luo
669a68489d meson: Use sse2_arg and sse2_args to replace usage of c and c_sse2_args
And now c_sse2_arg and c_sse2_args are remvoed

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21375>
2023-02-27 13:50:11 +00:00
Yonggang Luo
ddf708a1ff meson: Split sse2_arg and sse2_args out of c_cpp_args
This is used to replace c_sse2_arg and c_sse2_args in latter commits

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21375>
2023-02-27 13:50:11 +00:00
Yonggang Luo
446630ab42 meson: When sse2 enabled, both c and cpp using sse2 options
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21375>
2023-02-27 13:50:11 +00:00
Mike Blumenkrantz
c1a62476ac vulkan/wsi/x11: make 4 image minimum for xwayland driver-specific
this avoids adding extra frames of latency to drivers that don't need
it

Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21447>
2023-02-27 13:21:21 +00:00
Mike Blumenkrantz
7c8a5f6e37 vulkan/wsi: switch to using an options struct for last param
this makes adding values easier since the drivers won't need to be updated

Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21447>
2023-02-27 13:21:21 +00:00
Georg Lehmann
1c5c2f77c3 aco: use and swizzle mask in dpp quad perm
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21412>
2023-02-27 11:09:42 +00:00
Georg Lehmann
8fabde3be4 aco/gfx11: use dpp_row_xmask and dpp_row_share
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21412>
2023-02-27 11:09:42 +00:00
Georg Lehmann
b7cd0eb439 aco: use v_permlane(x)16_b32 for masked swizzle
Should be cheaper than ds_swizzle.

Totals from 8 (0.01% of 134913) affected shaders:
CodeSize: 16316 -> 16388 (+0.44%)
Instrs: 3088 -> 3086 (-0.06%)
Latency: 49558 -> 49508 (-0.10%)
InvThroughput: 9180 -> 9198 (+0.20%)
Copies: 376 -> 384 (+2.13%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21412>
2023-02-27 11:09:42 +00:00
Georg Lehmann
9f155c21c3 amd: d16 uses rtz conversion for 32bit float
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21404>
2023-02-27 09:55:34 +00:00
Georg Lehmann
77252687fa amd: don't use d16 for integer loads
D16 saturates to min/max instead of just truncating.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21404>
2023-02-27 09:55:34 +00:00
Georg Lehmann
a00b50d820 nir: change 16bit image dest folding option to per type
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21404>
2023-02-27 09:55:34 +00:00
Samuel Pitoiset
a14d46fde2 radv: enable primitiveUnderestimation on GFX9+
It's passing dEQP-VK.rasterization.conservative.underestimate.* on
NAVI21.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21459>
2023-02-27 09:04:01 +00:00
Samuel Pitoiset
dba7a66429 radv: set MSAA_NUM_SAMPLES to 0 for underestimate rasterization
Based on PAL.

Fixes
dEQP-VK.rasterization.conservative.underestimate.samples_*.triangles.normal.test.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21459>
2023-02-27 09:04:01 +00:00
Samuel Pitoiset
0eae617826 radv: stop setting ENABLE_POSTZ_OVERRASTERIZATION to 1
According to PAL this isn't set.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21459>
2023-02-27 09:04:01 +00:00
Samuel Pitoiset
05732f4519 radv: cleanup radv_emit_{conservative,msaa}_state() functions
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21459>
2023-02-27 09:04:01 +00:00
Mike Blumenkrantz
34e7c17cfe lavapipe: EXT_image_sliced_view_of_3d
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21514>
2023-02-27 07:49:48 +00:00
Lionel Landwerlin
66e3ccbcac vulkan/runtime: store parameters of VK_EXT_sliced_view_of_3d
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21514>
2023-02-27 07:49:48 +00:00
Tatsuyuki Ishi
ed03821442 radv/sqtt: Use code buffer from radv_shader directly instead of copying.
The reference-counted radv_shader always outlives the pipeline, so we can
use this buffer directly when dumping code objects to the trace.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21513>
2023-02-27 07:16:48 +00:00
Tatsuyuki Ishi
ea070fb83a radv: Keep shader code ptr in a separately allocated buffer.
RGP traces need a dump of shader code in order to display ISA and
instruction trace. Previously, this was read back from GPU at trace
creation time. However, for future changes that implements upload shader
to invisible VRAM, the upload destination will be a temporary staging
buffer and will be only accessible during shader creation.

To allow dumping in such cases, copy the shader code to a separate buffer
at creation time, if thread tracing is enabled.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21513>
2023-02-27 07:16:48 +00:00
Mike Blumenkrantz
52f27cda05 zink: allow direct memory mapping for any COHERENT+CACHED buffer
some drivers may provide this in heaps that get used by non-staging resources,
so avoid extra copies in that case

unlike the previous attempt at this optimization, this utilizes the screen-based
context for thread-safe transfers, which should avoid races/crashes

fix #8171

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21452>
2023-02-27 03:50:14 +00:00
Mike Blumenkrantz
d78de2a962 zink: add locking for zink_screen::copy_context and defer creation
the copy context isn't always used, so this allows its creation to
be deferred and potentially save a bunch of memory

also add locking for multi-context thread safety

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21452>
2023-02-27 03:50:14 +00:00
Mike Blumenkrantz
a7b98dd4be zink: avoid adding ubo/ssbo bindings multiple times for different bitsizes
these are valid variables, but the descriptor binding needs to be unique

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21522>
2023-02-27 03:11:44 +00:00
Mike Blumenkrantz
db1af91f1b zink: simplify/rework image typing in ntv
the array approach was broken if a shader contained both bindless
and non-bindless resources, whereas a hash table is simpler and can
handle both images and samplers

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21522>
2023-02-27 03:11:44 +00:00
Mike Blumenkrantz
ac5f72a023 zink: delete unused emit_image param in ntv
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21522>
2023-02-27 03:11:44 +00:00
Mike Blumenkrantz
fb4fd03db9 zink: fix bindless texture barrier generation
whenever I redid barriers I forgot to handle bindless textures,
which meant they weren't getting barriers added

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21522>
2023-02-27 03:11:44 +00:00
Mike Blumenkrantz
75e9ba85de zink: rework descriptor unbind params to use is_compute directly
much simpler

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21522>
2023-02-27 03:11:44 +00:00
Mike Blumenkrantz
ddb116d755 zink: fix shader read access removal for barrier generation
barrier access is based on total binds per gfx/compute, not per stage

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21522>
2023-02-27 03:11:44 +00:00
Mike Blumenkrantz
00288d4f53 zink: delete dead uniform variables
this just obfuscate nir, so delete them now

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21522>
2023-02-27 03:11:44 +00:00
Alyssa Rosenzweig
760f367386 agx: Lower sampler LOD bias
G13 does not support sampler descriptor LOD biasing, so this needs to be lowered
to shader code for APIs that require this functionality. Add an option to do
this lowering while doing our other backend texture lowerings. This generates
lod_bias_agx texture instructions which the driver is expected to lower
according to its binding model.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21276>
2023-02-27 02:35:41 +00:00
Alyssa Rosenzweig
23f271833f asahi: Lower lod_bias_agx to uniform registers
Track the LOD bias of samplers and upload them at draw time to uniform
registers. This could be optimized in the future.

Vulkan will probably want to pull from a descriptor set instead.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21276>
2023-02-27 02:35:41 +00:00
Alyssa Rosenzweig
8058d31a25 nir: Add nir_texop_lod_bias_agx
Add a new texture opcode that returns the LOD bias of the sampler. This will be
used on AGX to lower sampler LOD bias to txb and friends. This needs to be a
texture op (and not a new intrinsic) to handle both bindless and bindful
samplers across GL and Vulkan in a uniform way.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21276>
2023-02-27 02:35:41 +00:00
Qiang Yu
822e756511 ac/llvm,radeonsi: lower fbfetch in abi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21436>
2023-02-27 09:43:53 +08:00
Qiang Yu
28c2527e42 radeonsi: add num_component param to load_internal_binding
Prepare for different component number, ie. 8 when image desc.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21436>
2023-02-27 09:39:41 +08:00
Qiang Yu
5c44404b5f ac/llvm,radeonsi: lower nir_load_barycentric_at_sample in abi
RADV already did this in radv_lower_fs_intrinsics().

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21436>
2023-02-27 09:39:41 +08:00
Marek Olšák
0c8e7ad47e nir: lower to fragment_mask_fetch/load_amd with EQAA correctly
Fixes: 194add2c23 ("nir: lower image add lower_to_fragment_mask_load_amd option")
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21436>
2023-02-27 09:39:41 +08:00
Alyssa Rosenzweig
8313016543 nir/lower_blend: Consume dual stores
Now that we're working on lowered I/O, passing in the dual source blend colour
via a sideband doesn't make any sense. The primary source blend colours are
implicitly passed in as the sources of store_output intrinsics; likewise, we
should get dual source blend colours from their respective stores. And since
dual colours are only needed by blending, we can delete the stores as we go.
That means nir_lower_blend now provides an all-in-one software lowering of dual
source blending with no driver support needed! It even works for 8 dual-src
render targets, but I don't have a use case for that.

The only tricky bit here is making sure we are robust against different orders
of store_output within the exit block. In particular, if we naively lower

   x = ...
   primary color = x
   y = ...
   dual color = y

we end up emitting uses of y before it has been defined, something like

   x = ...
   primary color = blend(x, y)
   y = ...

Instead, we remove dual stores and sink blend stores to the bottom of the block,
so we end up with the correct

   x = ...
   y = ...
   primary color = blend(x, y)

lower_io_to_temporaries ensures that the stores will be in the same (exit)
block, so we don't need to sink further than that ourselves.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21426>
2023-02-26 17:35:08 -05:00
Alyssa Rosenzweig
44bdcb7214 panfrost: Use proper locations in blend shaders
Rather than always blending to FRAG_RESULT_DATA0. This removes silly special
cases in the compiler.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21426>
2023-02-26 17:35:07 -05:00
Francisco Jerez
4420251947 intel/rt: Fix L3 bank performance bottlenecks due to SW stack stride alignment.
Power-of-two SW stack sizes are prone to causing collisions in the
hashing function used by the L3 to map memory addresses to banks,
which can cause stack accesses from most DSSes to bottleneck on a
single L3 bank.  Fix it by padding the SW stack stride by a single
cacheline if it was a power of two.  This has been reported by Felix
DeGrood to improve Quake2 RTX performance by ~30% on DG2-512 in
combination with other RT patches Lionel Landwerlin has been working
on.

Many thanks to Felix DeGrood for doing much of the legwork and
providing several iterations of Q2RTX performance counter dumps which
eventually prompted me to consider the hash collision theory and
motivated this patch, and for providing additional performance counter
dumps confirming that there is no longer an appreciable imbalance in
traffic across L3 banks after this change.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21461>
2023-02-26 11:48:33 -08:00
Friedrich Vock
de4e3da4c4 docs: Fix formatting for RMV tracing docs
Fixes: e1cbff22 ("docs: Add short documentation about RMV tracing variables")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21502>
2023-02-26 19:15:44 +00:00
David Heidelberg
be2961de09 meson: print c_cpp_args
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21539>
2023-02-26 16:55:30 +01:00
Konstantin Seurer
2d93ab795b radv/rt: Pre shift cull_mask
This removes the need for masking the instance mask.

Totals from 14 (14.43% of 97) affected shaders:
CodeSize: 378696 -> 378308 (-0.10%); split: -0.12%, +0.02%
Instrs: 70854 -> 70855 (+0.00%); split: -0.02%, +0.02%
Latency: 1651235 -> 1651215 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 336290 -> 336285 (-0.00%); split: -0.00%, +0.00%
Copies: 9915 -> 9923 (+0.08%); split: -0.03%, +0.11%
PreSGPRs: 890 -> 896 (+0.67%)

 PERCENTAGE DELTAS Shaders  CodeSize   Instrs   Latency  InvThroughput   Copies   PreSGPRs
 q2rtx-pipe        48        -0.02%    -0.02%    -0.00%      -0.00%      -0.03%      .
 q2rtx_1           49        -0.10%    +0.02%    +0.00%      +0.00%      +0.14%    +0.31%
 -------------------------------------------------------------------------------------------
 All affected      14        -0.10%    +0.00%    -0.00%      -0.00%      +0.08%    +0.67%
 -------------------------------------------------------------------------------------------
 Total             97        -0.06%    +0.00%    -0.00%      -0.00%      +0.06%    +0.16%

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21530>
2023-02-26 12:58:13 +00:00
Frank Binns
964323fe97 pvr: remove duplicate define
The same define appears a few lines above.

Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21495>
2023-02-25 22:58:25 +00:00
Frank Binns
cbfa4d52ed pvr: stop restricting the compiler to the Sascha Willems triangle demo
Do this by removing the compatibility table and only using hard coded shaders
when present. The hard coded shaders, along with the hard coding framework
itself, can be dropped once the compiler is capable of compiling the hard coded
shaders. In the meantime we don't want to risk regressing things that we know
work because we temporarily can't test them.

This restriction is being dropped now as the new compiler framework has been
merged and we want to make use of it so it can be developed further.

Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21495>
2023-02-25 22:58:25 +00:00
Giancarlo Devich
cb7403b909 d3d12: Track up to 16 active context resource states locally in d3d12_bo
After 16 entries, we fall back to the previous logic that used a hash
map to link the resource's state per context.

Preventing hash map churn by cheaply tracking up to 16 context's worth
of states per resource significantly reduces CPU cost in
find_or_create_state_entry

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21528>
2023-02-25 18:14:37 +00:00
Giancarlo Devich
2c00c069fe d3d12: Assign up to 16 simultaneously active contexts unique IDs
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21528>
2023-02-25 18:14:37 +00:00
Giancarlo Devich
bd0e1b3d02 d3d12: Move d3d12_context_state_table_entry to d3d12_resource_state.h
Also renamed desired_resource_state to d3d12_desired_resource_state,
since it's also in the header now.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21528>
2023-02-25 18:14:37 +00:00