3698 commits

Author SHA1 Message Date
Emma Anholt
26abdef5bc turnip: Be sure we blit depth, not stencil, for Z32FS8 -> Z32F resolves.
Fixes: #7143
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19181>
2022-10-20 18:20:00 +00:00
Mark Collins
029d4cbf42 tu: Clean up variable usage in tu6_draw_common
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19107>
2022-10-19 19:00:42 +00:00
Mark Collins
9248ce2978 tu: Only write A6XX_PC_PRIMITIVE_CNTL_0 if changed
Increases the score in the `draw` test in `vkoverhead` to 71809
from 67170 on a HDK 888.

Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19107>
2022-10-19 19:00:42 +00:00
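
The idea behind 9248ce2978 (skip a register write when the value has not changed) can be sketched roughly as follows. This is only an illustrative sketch: `struct reg_cache` and `write_reg_if_changed` are hypothetical names, not turnip's actual structures or functions, and the `writes` counter stands in for emitting a real register packet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for driver state; not turnip's actual structures. */
struct reg_cache {
   bool valid;          /* false until the first write */
   uint32_t last_value; /* value most recently emitted */
   unsigned writes;     /* number of emitted writes, for illustration */
};

/* Emit the register write only when the value differs from the cached one.
 * Returns true if a write was emitted. */
static bool
write_reg_if_changed(struct reg_cache *cache, uint32_t value)
{
   if (cache->valid && cache->last_value == value)
      return false; /* redundant write, skip it */

   cache->valid = true;
   cache->last_value = value;
   cache->writes++; /* a real driver would emit the register packet here */
   return true;
}
```

Starting from a zero-initialized cache, writing the same value twice in a row emits only one write; a different value emits a new one.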
Rob Clark
2ad637f52a freedreno/a6xx: Update caps
We should be doing all the 64b lowering.. I think that should be enough
to get us at least glsl400.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19100>
2022-10-19 12:23:40 +00:00
Rob Clark
dc70a940d4 freedreno/a6xx: Fix primitives-generated query
RBBM_PRIMCTR_7 is pre-clipped, whereas RBBM_PRIMCTR_8 is after clipping.
I believe we want pre-clipping, and this is what tu does.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19100>
2022-10-19 12:23:40 +00:00
Rob Clark
f26631c6de freedreno/a6xx: Fix MAX_GEOMETRY_OUTPUT_VERTICES cap
Limited by the size of PC_PRIMITIVE_CNTL_5.GS_VERTICES_OUT

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19100>
2022-10-19 12:23:40 +00:00
Rob Clark
b96e8050d6 freedreno/ir3: Lower all the 64b
Just need to enable some existing lowering.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19100>
2022-10-19 12:23:40 +00:00
Rob Clark
1b38d233fc freedreno/ir3: Fix clipvertex with GS+tess
If we have both GS and tess, GS is the stage we should run lower_clip_vs
on.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19100>
2022-10-19 12:23:40 +00:00
Mark Collins
09ae2c4fee tu: Optimize hash_renderpass_instance by removing XXH64_update
Testing showed that `XXH64_update` is significantly slower than
calling `XXH64` directly for small inputs (small data velocity).
This function is called at the end of every renderpass, which made
it visible while profiling, but the substantial speedup (measured
to be ~4x) makes it no longer show up at all.

Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18428>
2022-10-18 16:28:29 +00:00
Danylo Piliaiev
3eed5931ed tu: Fix the size of patch control points state
tu6_emit_patch_control_points was called with CS size calculated
at compile time, but HS params have dynamic size. Account for this.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7479

Fixes: 68f3c38c80
("tu: Implement extendedDynamicState2PatchControlPoints")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19075>
2022-10-17 20:13:43 +00:00
Yonggang Luo
44ccaca41d util/mesa/wide: Rename _SIMPLE_MTX_INITIALIZER_NP to SIMPLE_MTX_INITIALIZER
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18393>
2022-10-14 03:27:41 +00:00
Emma Anholt
8721323100 turnip: Add perf debug for more UBWC-disable cases that we could support.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18990>
2022-10-11 19:10:18 +00:00
Emma Anholt
c425b7342e turnip: Add perf_debug for UBWC being disabled due to mutable formats.
I suspect this is going to be a popular perf issue for zink and angle.  I
keep having to print out format lists for debug.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18990>
2022-10-11 19:10:18 +00:00
Emma Anholt
29488c4183 turnip: Move the ubwc_possible check before mutable formats.
I'm going to add some perf debug about mutable formats, and I don't want
to warn when UBWC would be impossible anyway.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18990>
2022-10-11 19:10:18 +00:00
Emma Anholt
4fe3330765 turnip: Add a perf_debug for feedback-related performance traps.
This can show up in layering drivers.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18990>
2022-10-11 19:10:18 +00:00
Connor Abbott
9b1087ca7c tu: Add compute shader instrlen workaround
It's a bit unfortunate that this doesn't match any blob workaround that
we know of, but it seems to be necessary.

Closes: #5892
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19023>
2022-10-10 21:04:18 +00:00
Connor Abbott
0dd60610dc freedreno: Add LABEL flush
This seems like a debug thing, but the blob also seems to use it for
workarounds where an event is required but no actual work needs to be
done. For example CP_REG_WRITE uses it for various workarounds.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19023>
2022-10-10 21:04:18 +00:00
Danylo Piliaiev
a1c372cd84 tu: Reset whether there is DS resolve for dynamic subpass
Otherwise we use an old, invalid value.

Relevant CTS tests:
 dEQP-VK.pipeline.monolithic.multisample.misc.dynamic_rendering.multi_renderpass.r8g8b8a8_unorm_r16g16b16a16_sfloat_r16g16b16a16_*

Fixes: ed125e6cca
("tu: Initial support for dynamic rendering")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18999>
2022-10-10 20:45:51 +00:00
Danylo Piliaiev
4eba6d71a8 tu: Lazily init VSC to fix dynamic rendering in secondary cmdbufs
Dynamic renderpasses need vsc_prim_strm_pitch, vsc_draw_strm_pitch
values, and a correct BO. The easiest way to solve this is to
lazily init VSC when it is needed, and not at every cmdbuf
initialization.

Fixes CTS tests (when running with TU_DEBUG=gmem,forcebin):
 dEQP-VK.draw.dynamic_rendering.complete_secondary_cmd_buff.*

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18996>
2022-10-10 18:31:15 +00:00
Danylo Piliaiev
e70a2148e5 tu: Do not DCE unused output vars used for transform feedback
Fixes CTS tests:
 dEQP-VK.transform_feedback.simple.multiquery_omit_write_1
 dEQP-VK.transform_feedback.simple.multiquery_omit_write_3

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19020>
2022-10-10 18:12:04 +00:00
Rob Clark
f6f72b5629 freedreno/drm: Don't call kernel with no ops
When called with FD_BO_PREP_FLUSH as the only op bit set, the intention
is to only sync with the submit-queue.. we shouldn't be calling down to
the kernel (where op==0 gets interpreted as MSM_PREP_READ).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18926>
2022-10-08 14:25:29 +00:00
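
The check described in f6f72b5629 can be sketched as a simple bitmask test. This is an illustration under stated assumptions: the `EX_PREP_*` names and their bit values are hypothetical and are not the real `FD_BO_PREP_*` or `MSM_PREP_*` definitions, and `needs_kernel_call` is an invented helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define EX_PREP_READ  (1u << 0)
#define EX_PREP_WRITE (1u << 1)
#define EX_PREP_FLUSH (1u << 3) /* userspace-only: sync with the submit queue */

/* Returns true only if some op bit other than the flush bit is set.
 * If nothing is left after masking out the flush bit, calling the kernel
 * would pass op == 0, which the commit notes is treated as a read prep. */
static bool
needs_kernel_call(uint32_t op)
{
   uint32_t kernel_ops = op & ~EX_PREP_FLUSH;
   return kernel_ops != 0;
}
```

With these assumed values, a flush-only request stays in userspace, while flush combined with read or write still reaches the kernel.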
Emma Anholt
dadb29cf2e turnip: Don't use the dynamic color write enable during non-dynamic.
We have the correct merged color write enable state as a local var here,
use that instead of the zero cmd->state.color_write_enable.  Fixes
blending in many traces with ANGLE on turnip.  In the process of fixing,
clarify the logic a little bit.

Fixes: 169e03800d ("tu: Implement VK_EXT_color_write_enable")
Fixes: #7328
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18956>
2022-10-04 20:50:51 +00:00
Connor Abbott
68f3c38c80 tu: Implement extendedDynamicState2PatchControlPoints
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18773>
2022-10-04 15:39:43 +00:00
Connor Abbott
1bd3d28050 tu: DS primitive stride does not use patch control points
Previously we would use patch control points if there was no GS, but
it wasn't immediately obvious that this driver param is unused if there
is no GS. Make it output 0 instead, making it clear that we can emit it
even if we don't know the patch control points. This change in the
cmdstream is split out from the next commit.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18773>
2022-10-04 15:39:43 +00:00
Connor Abbott
042c135a99 tu: Fix param_stride placement
Even though it's tessellation-related, it's set based on the
tessellation variant which is only known after linking. The param stride
may change due to LTO if fast linking is not used.

Fixes: e9f5de11d4 ("tu: Initial implementation of VK_EXT_graphics_pipeline_library")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18773>
2022-10-04 15:39:43 +00:00
Connor Abbott
66b9c05bb9 ir3: Add missing cat5 encoding to asm parser
We were missing the case where there is a sampler and texture but the
texture offset is encoded in a1.x.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18840>
2022-10-04 14:00:50 +00:00
Connor Abbott
dcab399a17 ir3/analyze_ubo_ranges: Account for reserved consts better
It turns out that ir3_setup_const_state() already includes reserved
consts, so we were accidentally counting them twice. This makes us use
fewer consts, and if there are enough reserved consts the count can go
negative and wrap around. Fix this while also making sure the previous
bug remains fixed.

Fixes: 8cb1deded6 ("ir3/analyze_ubo_ranges: Account for reserved consts")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18840>
2022-10-04 14:00:50 +00:00
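
The failure mode described in dcab399a17 (double-counting followed by unsigned wraparound) can be reproduced in isolation. The sketch below is not the ir3 code: the function names and parameters are hypothetical, and `already_used` is assumed to include the reserved consts already.

```c
#include <stdint.h>

/* Buggy variant: 'already_used' is assumed to include the reserved consts,
 * so subtracting 'reserved' again counts them twice. With unsigned
 * arithmetic, a result below zero wraps around to a huge value. */
static uint32_t
remaining_double_counted(uint32_t max_consts, uint32_t already_used,
                         uint32_t reserved)
{
   return max_consts - already_used - reserved;
}

/* Fixed variant: subtract the used amount once and clamp at zero. */
static uint32_t
remaining_fixed(uint32_t max_consts, uint32_t already_used)
{
   return already_used >= max_consts ? 0 : max_consts - already_used;
}
```

With a budget of 64 and 40 already used, the buggy variant under-reports the remaining space for a small reserved count and wraps around once the reserved count exceeds the true remainder.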
Connor Abbott
c58d633dd2 ir3: Move fixup_regfootprint() to ir3_collect_info()
This fixes the case where fixup_regfootprint() adds to the reg footprint
but it isn't accounted for when determining whether we should double
threadsize in ir3_collect_info(). This would produce a hang on a650 and
above where we have a reg footprint of 33 and doubled threadsize.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18840>
2022-10-04 14:00:50 +00:00
Connor Abbott
7d1b8c8ab2 ir3: Delete outputs from fixup_regfootprint()
We weren't considering the number of components, which means that we
would overestimate the output size, which could result in nonsensical
things like a reg footprint larger than r48.x. In addition, in some
cases we can force double regsize which would go badly if this
miscalculated the reg footprint, although currently this only happens
with compute shaders where there are no outputs. It's not actually
necessary anyway, because any output must come from an input or
something in the shader - this is how RA works. Just delete it.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18840>
2022-10-04 14:00:50 +00:00
Eric Anholt
1a286837bc freedreno/ir3: Validate our scheduling DAGs after construction.
This gives us some better explanation of a stack overflow in ir3_postsched
with shader-db:

IR3_SHADER_DEBUG=nouboopt ./run shaders/nexuiz/46.shader_test

DAG validation failed at:
  0x55f6570e8460: 0079:0107:002: 	_meta:collect r1.w (wrmask=0xff), r1.w, r2.x, r2.y, r2.z, r2.w, r3.x, r3.y, r3.z, false-deps:_[0098:0126:000:  mov.u32u32], _[0112:0143:000:  mov.u32u32], _[0087:0113:000:  mov.u32u32], _[0113:0144:000:  mov.u32u32], _[0099:0127:000:  mov.u32u32], _[0088:0114:000:  mov.u32u32]

Nodes in stack:
  0x55f657102050: 0079:0103:009: 	mov.u32u32 r1.w, r0.x, right=_[0080:0104:009:  mov.u32u32]

  0x55f6570e8460: 0079:0107:002: 	_meta:collect r1.w (wrmask=0xff), r1.w, r2.x, r2.y, r2.z, r2.w, r3.x, r3.y, r3.z, false-deps:_[0098:0126:000:  mov.u32u32], _[0112:0143:000:  mov.u32u32], _[0087:0113:000:  mov.u32u32], _[0113:0144:000:  mov.u32u32], _[0099:0127:000:  mov.u32u32], _[0088:0114:000:  mov.u32u32]

  0x55f657075f80: 0083:0108:007: 	samgq (f32)(xyz)r0.z (wrmask=0x7), r1.w (wrmask=0xff), s#3, t#3

  0x55f657051b60: 0104:0134:008: 	ldc.offset0 r3.x (wrmask=0xf), imm[0.000000,0,0x0], r9.w

  0x55f657103040: 0112:0143:000: 	mov.u32u32 r9.w, r0.x, right=_[0113:0144:000:  mov.u32u32]

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6656>
2022-09-29 23:40:18 +00:00
Connor Abbott
1ca8930845 tu: Fix setting RB_DEPTH_CNTL::Z_CLAMP_ENABLE
I missed this when enabling pipeline libraries, and we were also setting
this to the wrong thing. Previously we were using rasterization state
when parsing depth/stencil indirectly via builder->depth_clip_disable,
which is not allowed with pipeline libraries. Fixing this is a bit
painful because now RB_DEPTH_CNTL can depend on state from both the
fragment shader library and the pre-rasterization library, in addition
to being disabled via output interface state.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18861>
2022-09-29 10:47:52 +00:00
Connor Abbott
0b131b3e99 freedreno/a6xx, tu: GRAS_CL_CNTL::UNK5 is Z_CLAMP_ENABLE
This changes the behavior for freedreno but it should ultimately be the
same for GL/GLES, given what mesa/st does.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18861>
2022-09-29 10:47:52 +00:00
Connor Abbott
5af6dad179 Revert "freedreno,ir3: rename Z_CLAMP_ENABLE to Z_CLIP_DISABLE"
This reverts commit 6cb41c5188. It was
incorrect and the issue it was trying to fix was actually a zink bug.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18861>
2022-09-29 10:47:52 +00:00
Joshua Ashton
0f770caa23 freedreno: Disable 8bpp_ubwc on a6xx gen2
Fixes text corruption in VSCode on a680.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18779>
2022-09-23 16:08:33 +00:00
Erik Faye-Lund
21ec469a2f zink: emulate latc formats with rgtc
util_format_luminance_to_red returns PIPE_FORMAT_NONE for LATC formats,
because there's no red-alpha variant of it, only red-green.

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18596>
2022-09-23 10:55:15 +00:00
Connor Abbott
8cb1deded6 ir3/analyze_ubo_ranges: Account for reserved consts
We weren't accounting for the reserved consts when calculating how much
we can upload. This led to assertion failures later if we pushed too
much.

Fixes: d3b7681df2 ("tu: ir3: Emit push constants directly")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18757>
2022-09-22 22:16:22 +00:00
Connor Abbott
750ecb0aa9 tu: Set textures_used for input attachments correctly
We were accidentally multiplying by 2 twice. Noticed by inspection.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18757>
2022-09-22 22:16:22 +00:00
Connor Abbott
f483419c23 tu: Fix maxPerStageDescriptorUpdateAfterBindInputAttachments
We need this to be the same as maxPerStageDescriptorInputAttachments.

Fixes: d9fcf5de55 ("turnip: Enable nonuniform descriptor indexing")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18757>
2022-09-22 22:16:21 +00:00
Emma Anholt
5e39b52e6a turnip: Fix busy-waiting on syncobjs with OS_TIMEOUT_INFINITE.
I noticed that glmark2's glFinish()es in its offscreen rendering tests
under zink were spinning.  When we passed -1 as the timeout for
drmSyncobjWait(), the kernel would immediately return ETIME.

Fixes: 0a82a26a18 ("turnip: Porting to common implementation for timeline semaphore")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18739>
2022-09-22 19:24:00 +00:00
Chia-I Wu
79208d8bf3 turnip: advertise VkExternalFenceProperties correctly
Remove tu_GetPhysicalDeviceExternalFenceProperties and let the common
entrypoint do the work.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18711>
2022-09-21 20:55:41 +00:00
Emma Anholt
112b8d7c4d ci/zink+turnip: Add a manual full run of the dEQP CTS.
We don't have enough spare boards to run this by default, but it's
catching interesting bugs and we want to be able to look at its status for
evaluating zink usage.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18717>
2022-09-21 19:57:07 +00:00
Emma Anholt
64d0e94d2c turnip: Use the simplified stencil write flags for the LRZ-allowed check.
Traces of GLES games that ANGLE has taken frequently have no-op stencil
writes, which ANGLE and Zink both pass straight through.  Given that we
support dynamic stencil state updates via tu_CmdSetStencil*(), draw time
really is the time for deciding this state unfortunately.

Reuse the fancier stencil write enables check from "can we do early z?" in
"can we do LRZ?".  This gets one set of draws in among_us to have LRZ, but
I don't see a detectable performance difference.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18691>
2022-09-21 17:18:07 +00:00
Emma Anholt
b9f9bfa556 turnip: Fix the "written stencil is unmodified" check.
We want to know if anything writes stencil, not if all of them do.

Fixes: b2a60c157e ("turnip: add LRZ early-z support")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18691>
2022-09-21 17:18:07 +00:00
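
The distinction fixed in b9f9bfa556 is "any" versus "all". A minimal sketch, assuming a hypothetical array of per-operation flags that say whether each stencil op would modify the stencil value (these helpers are illustrative and not taken from turnip):

```c
#include <stdbool.h>
#include <stddef.h>

/* True if at least one op writes stencil (the intended check). */
static bool
any_writes_stencil(const bool *writes, size_t n)
{
   for (size_t i = 0; i < n; i++) {
      if (writes[i])
         return true;
   }
   return false;
}

/* True only if every op writes stencil (the mistaken check). */
static bool
all_write_stencil(const bool *writes, size_t n)
{
   for (size_t i = 0; i < n; i++) {
      if (!writes[i])
         return false;
   }
   return true;
}
```

When only one of several ops modifies stencil, the "all" check reports no write and would wrongly treat stencil as unmodified, while the "any" check catches it.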
Danylo Piliaiev
075cd3ca94 tu: Expose Vulkan 1.3
We have all required functionality implemented, and DXVK now requires
Vulkan 1.3.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18709>
2022-09-21 11:56:26 +00:00
Connor Abbott
e9f5de11d4 tu: Initial implementation of VK_EXT_graphics_pipeline_library
Now that the state for each pipeline is split into pieces, we can mostly
implement it by stitching together the pieces. One TODO is that we could
do more to split up the pre-rast and FS commands into separate draw
states so that we have to emit fewer commands when fast linking.
Currently we compile the variants but delay emitting the commands until
link time, but note that even the Gallium driver doesn't currently do
this. Given the strict SSO model (e.g. with separate VPC registers for
each stage) it may even be possible to do most of the linking ahead of
time with only a few fixups for corner cases.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18554>
2022-09-21 11:20:15 +00:00
Connor Abbott
0a47002a65 tu: Abstract driver-specific const state
Right now, we pass around the push constant state in a lot of places,
but we'll want to add other driver-managed constants. Add a struct which
we can add to, and separate out the total driver-reserved constants from
the size of push constants.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18554>
2022-09-21 11:20:15 +00:00
Connor Abbott
29262f3337 tu: Use vk_pipeline_shader_stage_to_nir
This will be necessary for graphics pipeline libraries where pipeline
stages can have the SPIR-V inline.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18554>
2022-09-21 11:20:15 +00:00
Connor Abbott
46b2c62947 tu: Split up prim order computation
With pipeline libraries, computing this might have to be delayed because
it depends on multiple pieces of state and there's no way to disentangle
them. Therefore we have to store the requisite state in the pipeline and
combine it later.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18554>
2022-09-21 11:20:15 +00:00
Connor Abbott
9eca3b12f6 tu: Move no_earlyz computation to blend/msaa state
This removes the last dependency of FS outputs on blend/MSAA state.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18554>
2022-09-21 11:20:15 +00:00
Connor Abbott
d6bf8efcdf tu: Emit *_OUTPUT_CNTL1 as part of blend state
This further decouples the fragment shader from the blend state.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18554>
2022-09-21 11:20:15 +00:00