Commit graph

141245 commits

Author SHA1 Message Date
Daniel Schürmann
b1e12747b9 aco: create VMEM clauses slightly more aggressive
Totals from 3325 (2.39% of 139391) affected shaders (NAVI10):
SGPRs: 331528 -> 331056 (-0.14%); split: -0.14%, +0.00%
VGPRs: 306164 -> 337764 (+10.32%); split: -0.02%, +10.34%
CodeSize: 38843180 -> 38865388 (+0.06%); split: -0.04%, +0.10%
MaxWaves: 18908 -> 17028 (-9.94%); split: +0.01%, -9.95%
Instrs: 7423908 -> 7427934 (+0.05%); split: -0.06%, +0.12%
Cycles: 527411756 -> 526388408 (-0.19%); split: -0.21%, +0.02%
VMEM: 1148421 -> 992660 (-13.56%); split: +0.10%, -13.67%
SMEM: 227337 -> 232380 (+2.22%); split: +2.26%, -0.04%
VClause: 146416 -> 111171 (-24.07%); split: -24.10%, +0.03%
SClause: 243674 -> 243689 (+0.01%); split: -0.00%, +0.01%
Copies: 663496 -> 660333 (-0.48%); split: -0.85%, +0.37%
Branches: 223725 -> 223721 (-0.00%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7903>
2020-12-22 15:08:40 +01:00
Daniel Schürmann
ac40301dbb aco: schedule position exports in the same pass as memory operations
No fossil-db changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7903>
2020-12-22 15:08:40 +01:00
Daniel Schürmann
0287ebeef3 aco: fix def-use distance calculation when scheduling.
This change also increases the VMEM_MAX_MOVES
to mitigate some of the scheduling changes.

Totals from 34301 (24.61% of 139391) affected shaders:
SGPRs: 2515440 -> 2552304 (+1.47%); split: -1.25%, +2.71%
VGPRs: 1786676 -> 1794724 (+0.45%); split: -0.31%, +0.76%
CodeSize: 151079856 -> 151209828 (+0.09%); split: -0.06%, +0.15%
MaxWaves: 392454 -> 388966 (-0.89%); split: +0.39%, -1.28%
Instrs: 28870746 -> 28895907 (+0.09%); split: -0.09%, +0.17%
Cycles: 960450680 -> 961315796 (+0.09%); split: -0.09%, +0.18%
VMEM: 19027987 -> 19796223 (+4.04%); split: +7.49%, -3.45%
SMEM: 2434691 -> 2394829 (-1.64%); split: +2.80%, -4.43%
VClause: 551776 -> 543051 (-1.58%); split: -1.73%, +0.15%
SClause: 1230147 -> 1227637 (-0.20%); split: -1.40%, +1.20%
Copies: 1957640 -> 1963617 (+0.31%); split: -1.11%, +1.41%
Branches: 611747 -> 612504 (+0.12%); split: -0.11%, +0.23%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7903>
2020-12-22 15:08:40 +01:00
Daniel Schürmann
3f14140f48 aco: allow to schedule SALU/SMEM through exec changes
Totals from 16794 (12.05% of 139391) affected shaders (NAVI10):
SGPRs: 757760 -> 762048 (+0.57%); split: -0.39%, +0.95%
VGPRs: 402844 -> 402744 (-0.02%); split: -0.04%, +0.02%
CodeSize: 22290900 -> 22285068 (-0.03%); split: -0.06%, +0.04%
MaxWaves: 294163 -> 294222 (+0.02%); split: +0.03%, -0.01%
Instrs: 4190074 -> 4188513 (-0.04%); split: -0.08%, +0.04%
Cycles: 40685028 -> 40678640 (-0.02%); split: -0.03%, +0.02%
VMEM: 7711867 -> 7704315 (-0.10%); split: +0.28%, -0.38%
SMEM: 942472 -> 1007052 (+6.85%); split: +7.15%, -0.30%
VClause: 92990 -> 92974 (-0.02%); split: -0.03%, +0.01%
SClause: 263700 -> 263810 (+0.04%); split: -0.38%, +0.42%
Copies: 277467 -> 276988 (-0.17%); split: -0.37%, +0.20%
Branches: 45899 -> 45896 (-0.01%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7903>
2020-12-22 15:08:40 +01:00
Daniel Schürmann
4a70c4d383 aco: make pred_by_exec_mask() accessible in other files
and rename to needs_exec_mask().

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7903>
2020-12-22 15:08:40 +01:00
Daniel Schürmann
2116b4504e aco: don't emit parallelcopy when switching to WQM.
The reason was an RA bug which has been fixed a while ago.
This change fixes some register demand miscalculations.

Totals from 1013 (0.73% of 139391) affected shaders (NAVI10):
CodeSize: 6050408 -> 6047504 (-0.05%); split: -0.05%, +0.00%
Instrs: 1160533 -> 1159765 (-0.07%); split: -0.07%, +0.00%
Cycles: 8027212 -> 8024140 (-0.04%); split: -0.04%, +0.00%
VMEM: 296195 -> 296091 (-0.04%)
SMEM: 73003 -> 73011 (+0.01%); split: +0.05%, -0.04%
SClause: 37221 -> 37222 (+0.00%)
Copies: 70931 -> 70166 (-1.08%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7903>
2020-12-22 15:08:40 +01:00
Mike Blumenkrantz
f815b87e18 zink: export tess shader pipe caps
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8152>
2020-12-22 13:46:38 +00:00
Mike Blumenkrantz
6324699e67 zink: handle partial writes to shader outputs
this is super gross. spirv doesn't provide any facility for doing per-component
writes, which means all components of a value must be written every time

to this end, we need to manually split both the src and dst composites and
do per-component access for each store in order to accurately handle both
non-sequential wrmasks (which could be handled by nir_lower_wrmasks, yes, but
we aren't using it) as well as partial wrmasks

see also mesa/mesa#4006

Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8152>
2020-12-22 13:46:38 +00:00
Mike Blumenkrantz
334759d850 zink: implement passthrough tcs shader injection
GL allows the pipeline to "infer" a tcs shader if a tes shader is bound using
API-specified default values for gl_TessLevelOuter and gl_TessLevelInner,
but VK requires that both shaders be explicitly present

to handle this, create a generic tcs which translates all vs outputs to
invocation-based arrays and copy the appropriate value to the expected tes
input array location. also emit the default inner/outer values as push constants
so we don't have to recompile the shaders whenever the api calls occur

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8152>
2020-12-22 13:46:38 +00:00
Mike Blumenkrantz
938b7c480e zink: add stubs for tess outer/inner level handling
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8152>
2020-12-22 13:46:38 +00:00
Mike Blumenkrantz
841d665209 zink: add push constant handling to get_storage_class()
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8152>
2020-12-22 13:46:38 +00:00
Mike Blumenkrantz
15f478fe84 zink: only run nir_lower_clip_halfz for last vertex processing stage
this lets us remove the calcs to un-convert POS during load

Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8152>
2020-12-22 13:46:38 +00:00
Mike Blumenkrantz
5b2c397c54 zink: add handling for tcs and tes shader states
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8152>
2020-12-22 13:46:38 +00:00
Mike Blumenkrantz
536520d056 zink: support PIPE_PRIM_PATCHES
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8152>
2020-12-22 13:46:38 +00:00
Mike Blumenkrantz
291bbac12c zink: set tess info in pipeline creation
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8152>
2020-12-22 13:46:38 +00:00
Mike Blumenkrantz
2891e0b74e zink: pull xfb info from tess shader when applicable
if it's the last vertex stage then it does the xfb

Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8152>
2020-12-22 13:46:38 +00:00
Mike Blumenkrantz
612d8f81c3 zink: set scoped barrier flag in nir options
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8152>
2020-12-22 13:46:38 +00:00
Mike Blumenkrantz
6ca3866056 zink: set up ntv init for tess shaders
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8152>
2020-12-22 13:46:38 +00:00
Mike Blumenkrantz
c744f079fe zink: add handling for tess shader intrinsics
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8152>
2020-12-22 13:46:38 +00:00
Mike Blumenkrantz
d09f9da4c4 zink: add ntv handling for tess shader i/o variables
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8152>
2020-12-22 13:46:38 +00:00
Mike Blumenkrantz
244310eddc zink: don't always run nir_lower_io_arrays_to_elements_no_indirects
this is automatically run for fs and vs, which is the only place we really
want it

Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8152>
2020-12-22 13:46:38 +00:00
Samuel Pitoiset
4a4ea89a99 radv: add code that checks if the extension table is sorted correctly
Ported from ANV.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8190>
2020-12-22 14:09:54 +01:00
Samuel Pitoiset
e1d1e5b7bd radv: sort the extension table like Khronos
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8190>
2020-12-22 14:09:52 +01:00
Marek Olšák
47199ee0cc cso: inline cso_construct_key
The x86 asm is a lot shorter and the loop is unrolled.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7940>
2020-12-22 12:01:38 +00:00
Marek Olšák
0d7aae7d9c cso: remove context and delete_state pointers from all CSOs
We just need them per context, not per CSO. The new delete callback
replaces the per-CSO callbacks.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7940>
2020-12-22 12:01:38 +00:00
Marek Olšák
e91c6ca5b2 st/mesa: don't make a local copy of blend color
This is perfectly safe and nothing bad can happen... and we have also CI.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7940>
2020-12-22 12:01:38 +00:00
Marek Olšák
a35014954b cso: don't pass blend_color through cso_context
It's never saved or restored. Redundant state changes are already
filtered out by mesa/main.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7940>
2020-12-22 12:01:38 +00:00
Marek Olšák
912ba743b5 gallium: inline pipe_depth_state to decrease DSA state size by 4 bytes
Depth and alpha states are now packed together, interleaved somewhat.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7940>
2020-12-22 12:01:38 +00:00
Marek Olšák
d0534cea7f gallium: inline pipe_alpha_state to enable better DSA bitfield packing
pipe_alpha_state and pipe_depth_state will be packed together
because they have only a few bitfields each. This will eventually
remove 4 bytes of padding in pipe_depth_stencil_alpha_state.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7940>
2020-12-22 12:01:38 +00:00
Marek Olšák
b7f12a0452 gallium: pass pipe_stencil_ref by value (it has only 2 bytes)
This changes pipe_context::set_stencil_ref to pass the parameter by value.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7940>
2020-12-22 12:01:38 +00:00
Samuel Pitoiset
2d87e52b37 radv: enable VK_EXT_line_rasterization on GFX9
It was disabled because some CTS failed but they pass now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8189>
2020-12-22 09:25:48 +01:00
Hyunjun Ko
ec1464077b turnip: use ir3_compiler_destroy instead of ralloc_free
Fixes: c0f22c3d94 "freedreno/ir3: add ir3_compiler_destroy()"

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6808>
2020-12-22 04:57:22 +00:00
Hyunjun Ko
19a7a915ca turnip/kgsl: support VK_KHR_performance_query
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6808>
2020-12-22 04:57:22 +00:00
Hyunjun Ko
3d90909837 turnip: enable VK_KHR_performance_query with new debug flag
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6808>
2020-12-22 04:57:22 +00:00
Hyunjun Ko
c921a6e98d turnip: support multipass for performance query.
To support multipass, querying perf counters happens in several steps
below.

0) There's a scratch reg to set pass indices for perf counters query.
   Prepare cmd streams to set each pass index to the reg at device
   creation time. See tu_CreateDevice in tu_device.c
1) Emit command streams to read all requested perf counters at all
   passes in begin/end query with CP_REG_TEST/CP_COND_REG_EXEC, which
   reads the scratch reg where pass index is set.
2) Pick the right cs setting proper pass index to the reg and prepend it
   to the command buffer at each submit time.
3) If the pass index in the reg is true, then executes the command
   stream below CP_COND_REG_EXEC.

Would need to implement for kgsl in the future.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6808>
2020-12-22 04:57:22 +00:00
Hyunjun Ko
937dd76426 turnip: Implement VK_KHR_performance_query
There are still some commands unimplemented yet.

- vkGetPhysicalDeviceQueueFamilyPerformanceQueryPassesKHR:
  The following patch supports this.

- vkAcquireProfilingLockKHR / vkReleaseProfilingLock
  This patch supports only monitoring perf counters for each submit.
  To reserve/configure counters across submits we would need a kernel
  interface to be able to do that.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6808>
2020-12-22 04:57:22 +00:00
Icecream95
a250f3620c panfrost: Fix panfrost_small_padded_vertex_count for 17 vertices
All odd numbers above 10 need to be rounded up to an even number, so
add one and mask off the least significant bit instead of maintaining
a list of special cases.

Fixes crashes in SuperTuxKart.

Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8191>
2020-12-22 01:54:24 +00:00
Icecream95
fdcb03c2d7 panfrost: Expose ARB_texture_filter_anisotropic on supported GPUs
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8149>
2020-12-21 22:49:28 +00:00
Icecream95
48c676c501 panfrost: Add a gpu_revision argument to panfrost_get_quirks
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8149>
2020-12-21 22:49:28 +00:00
Icecream95
0322653b71 panfrost: Set the anisotropy level when cso->max_anisotropy is set
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8149>
2020-12-21 22:49:28 +00:00
Icecream95
601dfd0093 panfrost: Fix the Maximum anisotropy field in the XML
It needs a a minus(1) modifier.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8149>
2020-12-21 22:49:28 +00:00
Alyssa Rosenzweig
9c042b6976 panfrost: Fix LOD mode field on Bifrost
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8149>
2020-12-21 22:49:28 +00:00
Alyssa Rosenzweig
8db0775f45 pan/bi: Minor styling cleanup in disasm
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8139>
2020-12-21 21:54:50 +00:00
Alyssa Rosenzweig
15558873f4 pan/bi: Remove all-0's termination condition
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8139>
2020-12-21 21:54:50 +00:00
Alyssa Rosenzweig
b18855a0a1 pan/bi: Space out disassembly
Makes things much more legible with some "room to breathe".

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8139>
2020-12-21 21:54:50 +00:00
Alyssa Rosenzweig
3071f36cfb pan/bi: Allow toggling disassembly verbosity
Verbose mode is especially useful for debugging packing, but otherwise
just gets in the way I find.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8139>
2020-12-21 21:54:50 +00:00
Yevhenii Kolesnikov
5ad54d498c intel/fs: don't spill a register, set by undef
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3941
Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8185>
2020-12-21 21:18:01 +00:00
Eric Anholt
bb4ade40e4 lvp: Fix vtn warnings about unsupported image read/write without format.
These are just warnings printed to the console and don't affect testcase
pass/fail, but clog up the deqp-runner job logs.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8188>
2020-12-21 20:33:35 +00:00
Eric Anholt
463dbbffa8 ci/deqp: Make sure that we pull in all board-specific xfail/skip/flake files.
When introducing/removing these files, it's easy to forget to update the
yml to point to them.  Instead of requiring the separate update, just have
the runner script pick the right one from a single per-gpu variable.

As a result, we now pick up the new deqp-lvp-skips.txt that was added but
not conected.  This also required moving some bypass flakes from the
shared a630 flakes list to a separate list, which is a feature because now
we'd notice the introduction of flakes to the gmem path.

Fixes: ab79e6b8e3 ("ci: skip failing test on lavapipe")
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8147>
2020-12-21 18:44:43 +00:00
Bas Nieuwenhuizen
9339ed2f85 radv: Enable DCC in the GENERAL layout on GFX10+.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7004>
2020-12-21 18:32:24 +00:00