Commit graph

3110 commits

Author SHA1 Message Date
Iago Toral Quiroga
1442861141 v3dv: fix comment for point_sprite_mask filed in shader key
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17486>
2022-07-13 05:20:31 +00:00
Emma Anholt
7976d558d5 vc4: Add links to test bug reports.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17449>
2022-07-12 17:15:43 +00:00
Emma Anholt
2f851f0479 vc4: Work around a HW bug with 2-vert line loops.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17449>
2022-07-12 17:15:43 +00:00
Emma Anholt
0f37e3c339 mesa: Fix the error check for VertexAttrib*.
It was checking "mesa's theoretical max attributes" rather than "the
driver's max attributes."

Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17449>
2022-07-12 17:15:43 +00:00
Eric Engestrom
9db1af8757 v3dv: use updated tokens from vk.xml
Signed-off-by: Eric Engestrom <eric@igalia.com>
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Acked-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17342>
2022-07-12 15:53:11 +00:00
Iago Toral Quiroga
f286289c7f v3dv: remove unused lowering for nir_intrinsic_load_layer_id
This intrinsic is only produced when the compiler is instructed
to handle layer id as a system value, which we don't use. Also,
we have been supporting layered rendering for a while and passing
all the relevant tests which would've failed if we were hitting
this lowering.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17483>
2022-07-12 11:47:13 +00:00
Iago Toral Quiroga
5a4c5f46c7 v3dv: fix comment in texel buffer shader copy path
When using the texel buffer copy path to copy a buffer we need to
sample from the buffer and for that we need a texture shader state
record where we specify the base offset of the texture (the buffer).
If the copy operation has a start offset we can't add that offset
to the base address of the buffer because the texture state record
requires the base pointer to be 64-byte aligned, so it would only
work for offsets that are multiple of 64B. Instead, we pass the
offset (in elements) to the shader and we use that to shift the
indices into the buffer when selecting the source texel to copy.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17482>
2022-07-12 10:48:45 +00:00
Iago Toral Quiroga
871a7536e8 broadcom/compiler: don't over-estimate latency of TMU instructions
Over-estimating latency can cause us to delay the critical paths of
the shader unnecessarily, producing larger QPU programs that take more
time to execute as a result (and it also adds register pressure) so
striking a balance is important. The thread switching model in V3D
is quite effective at hiding latency and usuallly we just need to
hint it to delay TMU instructions a little bit to find the best
compromise for performance.

The new latency numbers have been chosen empirically by testing
V3DV with Sponza and a few UE4 samples.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17451>
2022-07-11 10:34:58 +00:00
Iago Toral Quiroga
f227aa7c98 broadcom/compiler: don't try to hide TMU latency at QPU scheduling
Based on empirical testing with Sponza and a few UE4 samples this is
consistently slightly benefitial for performance.

The most likely reason why this helps is that thrsw is probably
already quite effective at hiding latency and we are already trying
to hide latency at NIR scheduling and also via TMU pipelining, so
piling up on this when scheduling QPU typically ends up providing no
benefit at all for latency and is instead possibly preventing us to
unblock critical paths in the shader that depend on the TMU result,
requiring us to execute more cycles to complete the program.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17451>
2022-07-11 10:34:58 +00:00
Emma Anholt
e9840e409f vc4: Add notes on the remaining dEQP failures.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17350>
2022-07-10 02:50:09 +00:00
Emma Anholt
48a9196632 vc4: Move previous existing 3D xfails up to the group of 3d xfails.
Clears up known issues from ones that should be investigated and
explained.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17350>
2022-07-10 02:50:09 +00:00
Emma Anholt
426c7b65db vc4: Disable OES_texture_3D being exposed.
The hardware doesn't support 3D textures.  We had been lying about 3D
texture level support in the past so that we got GL 2.1, but now reporting
levels==0 doesn't disable GL 2.1 (since we don't check for GL2 extensions
any more).  But, by not lying, we now fix the majority of the remaining
GLES2 deqp failures.

This regresses a few desktop GL piglits which get GL errors that they
notice instead of what would be silent rendering failures on 3D texturing
operations.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17350>
2022-07-10 02:50:09 +00:00
Iago Toral Quiroga
f4a3bccf94 v3dv: remove obsolete comment
multop + umul24 can only be used to implement 32-bit multiplies,
so for a full 64-bit result we always need to lower.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17372>
2022-07-07 09:16:24 +00:00
Iago Toral Quiroga
152fc4fd28 v3dv: don't lower uadd_carry and usub_borrow
We can produce slightly better code for these in the backend, so
do that. For this we need to:

1. Fix our implementation of uadd_carry (which wasn't used) to return
   an integer instead of a boolean value.
2. Add an implementation of usub_borrow.

Notice these are only used in Vulkan. In GL these instructions are
always unconditionally lowered by the state tracker in GLSL IR so
we never get to see them in the backend.

Shader-db stats from a collection of Vulkan samples:

total instructions in shared programs: 122351 -> 122345 (<.01%)
instructions in affected programs: 196 -> 190 (-3.06%)
helped: 2
HURT: 0

total uniforms in shared programs: 18670 -> 18672 (0.01%)
uniforms in affected programs: 59 -> 61 (3.39%)
helped: 0
HURT: 2

total max-temps in shared programs: 13145 -> 13147 (0.02%)
max-temps in affected programs: 27 -> 29 (7.41%)
helped: 0
HURT: 2

total inst-and-stalls in shared programs: 123052 -> 123046 (<.01%)
inst-and-stalls in affected programs: 197 -> 191 (-3.05%)
helped: 2
HURT: 0

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17372>
2022-07-07 09:16:24 +00:00
Iago Toral Quiroga
7dc951374c v3dv: fix merge jobs
This only works if the framebuffer config is exactly the same so
testing both subpasses have the same attachments is not enough,
they also need to be exactly in the same order.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17358>
2022-07-06 05:49:37 +00:00
Iago Toral Quiroga
7b91b39ba5 v3dv: fix pool descriptor count for inline uniform buffers
Fixes VK_ERROR_OUT_OF_POOL_MEMORY in the inlineuniformblocks
sample from Sascha Willems.

Fixes: ea3223e7a4 ('v3dv: implement VK_EXT_inline_uniform_block')
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17311>
2022-07-01 11:12:39 +00:00
Eric Engestrom
c06926f694 broadcom/rpi4-skips: drop duplicated lines
Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17312>
2022-07-01 08:09:48 +00:00
Juan A. Suarez Romero
037e7e8066 v3d/ci: Add flake test
This test works when executed alone, but fails when running the full
GLES3 CTS.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17300>
2022-06-29 14:01:20 +02:00
Boris Brezillon
a8cd159538 v3dv: Use vk_pipeline_hash_shader_stage()
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17186>
2022-06-28 09:07:32 +00:00
Boris Brezillon
863b6317a3 v3dv: Fix nir_shader leaks in v3dv_meta_{clear,copy}()
Reported-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17186>
2022-06-28 09:07:32 +00:00
Iago Toral Quiroga
cfccd93efc broadcom/compiler: don't predicate postponed spills
The postponed spill is predicated using the condition from the
last write, but this is only correct if the register was only
written once in the TMU sequence, or if it is always written with
the same predication.

While we could try to track whether this is the case or not, it
would make the postponed spill path even more complex than it
already is, so let's just avoid predicating these. We are already
discouraging TMU spilling of registers in the middle of TMU
sequences, so this should not be a very common case.

Cc: mesa-stable
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17201>
2022-06-28 05:49:51 +00:00
Iago Toral Quiroga
98420408d0 broadcom/compiler: fix postponed TMU spills with multiple writes
If we are spilling a register that is used in the middle of a TMU
sequence, we postpone the spill until the TMU sequence finishes,
at which point we inject the spill and rewrite the original
instruction to write to the new temp.

However, this doesn't work if the register is written multiple
times during the TMU sequence. In that scenario, we need to ensure
that all writes are rewritten to use the new temp, not just the last
one.

Cc: mesa-stable
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17201>
2022-06-28 05:49:51 +00:00
Iago Toral Quiroga
0bc65b1d81 v3dv: fix leak
Cc: mesa-stable
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17201>
2022-06-28 05:49:51 +00:00
Ella Stanforth
f392b6c1ad v3dv: Implement VK_KHR_performance_query
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14061>
2022-06-27 07:34:16 +00:00
Emma Anholt
13bf36588d ci/bare-metal: Consolidate needs declarations in .baremetal-test-*.
We had it set up for arm64 asan already, do it for everyone else too.  In
cleaning up the duplication, this fixes a pasteo in rpi3 which had the
"artifacts: false" on the wrong job, causing it to do a slow download of
the mesa build from gitlab.

Doing this required also moving the ".use-debian/arm_test" in as well, so
that its "needs:" didn't overwrite ours if it appeared after us in the
consumer's "extends:"

Should save about 20 seconds on rpi3 jobs.

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17146>
2022-06-22 20:59:54 +00:00
Emma Anholt
4309e09d6f vc4: Propagate txf_ms's dest_type to the lowered txf.
This was missing, and the added validation caught it.

Fixes: 708c47e663 ("nir: Validate nir_tex_instr::dest_type bitsize")
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17172>
2022-06-22 07:10:18 -07:00
Emma Anholt
1de87497ba ci/vc4: Turn on deqp-egl testing by default.
Now that we have one less job, let's flip this on.

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17172>
2022-06-22 07:10:14 -07:00
Emma Anholt
e9fad0b9aa ci/vc4: Merge quick_shader in with deqp-gles
All 4 jobs had a total of about 26 minutes of runner time, so squish them
onto 3 runners and use gbm for the .shader_tests to avoid X overhead and
hopefully succeed with full concurrency.

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17172>
2022-06-22 07:09:53 -07:00
Emma Anholt
5f09b1ebe9 ci/bare-metal: Add test phase timeouts to all boards.
This should help with "marge got stuck for an hour and all I got was this
failed job with no results/" when a system intermittently wedges.

This replaces the BM_POE_TIMEOUT ("did we get something on serial in the
last 3 minutes?") that rpi had, in favor of checking that the whole test
job gets through in 20 minutes.

Acked-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17096>
2022-06-21 21:38:25 +00:00
Juan A. Suarez Romero
c0626a6bd2 v3dv/ci: Update expected results
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17141>
2022-06-20 15:37:39 +00:00
Jose Maria Casanova Crespo
901f5e6a31 v3dv/ci: increase fraction to 10 on v3dv ci jobs.
We reduce the v3dv ci jobs time execution from ~20min to
8-11 min.

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17026>
2022-06-14 20:33:34 +00:00
Alejandro Piñeiro
51bdac4846 v3dv/pipeline: expand nir_optimize, drop st_nir_opts
Right now we had two methods that tries to optimize the nir shader,
nir_optimize and st_nir_opts. The latter is being used when we are
linking, but again, it has basically the same purpose that
nir_optimize.

So this commit adds more lowerings to nir_optimize_nir, add some extra
comments on the method, and replaces st_nir_opts with nir_optimize.

Ideally we would like to just use the already existing
v3d_optimize_nir that we have at the backend But:
   * Using it leads to some regressions on Vulkan CTS tests, due some
     lowerings that are already there.
   * We would need to move to the backend some additional
     lowerings/optimizations that are used on the Vulkan
     frontend. That would require to check that we are not getting any
     regression or performance drop on OpenGL

So for now we are keeping a Vulkan specific nir_optimize method.

Additionally this fixes the following test:
dEQP-VK.graphicsfuzz.cov-loop-condition-clamp-vec-of-ones

Shaderdb stats, using some well known Vulkan apps (ue4 demos, Quake3e,
etc):

 total instructions in shared programs: 124974 -> 125108 (0.11%)
 instructions in affected programs: 50328 -> 50462 (0.27%)
 helped: 4
 HURT: 79

 total uniforms in shared programs: 19019 -> 19020 (<.01%)
 uniforms in affected programs: 60 -> 61 (1.67%)
 helped: 0
 HURT: 1

 total max-temps in shared programs: 13438 -> 13444 (0.04%)
 max-temps in affected programs: 85 -> 91 (7.06%)
 helped: 0
 HURT: 2

 total inst-and-stalls in shared programs: 125715 -> 125849 (0.11%)
 inst-and-stalls in affected programs: 50429 -> 50563 (0.27%)
 helped: 4
 HURT: 79

 total nops in shared programs: 8203 -> 8204 (0.01%)
 nops in affected programs: 732 -> 733 (0.14%)
 helped: 7
 HURT: 9

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16986>
2022-06-14 13:12:46 +00:00
Alejandro Piñeiro
36c547342a v3dv/pipeline: call nir_lower_explicit_io after first nir optimization loop
That is what most others Vulkan drivers do (radv, anv, turnip at
least).

The origin of this change cames from a CTS test where the loop
unrolling converted a ubo index defined inside a loop from constant to
non constant. That is not desiderable on any driver, but a problem on
v3dv, as v3dv doesn't support that case.

Although we initially tried to fix it on the loop unroll, we discarded
that approach, and focused on the existing nir lowerings/optimizations
as this was not happening with other drivers.

We noted that in other drivers this case of a ubo index going from
const to non-const were also happening with nir_lower_explicit_io, but
in that case it was able to be converted back to a const on following
lowerings. The only difference with other drivers is that we were
calling it before the first nir optimization loop.

So this change helps with fixing the following CTS test (for that we
also need to run additional lowerings, which we do in a later patch):
   dEQP-VK.graphicsfuzz.cov-loop-condition-clamp-vec-of-ones

You can get further details on the following issue and RFC merge
request, specially the merge request:
  https://gitlab.freedesktop.org/mesa/mesa/-/issues/6051
  https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15391

We also made some shaderdb stats with our usual Vulkan apps (ue4
demos, quake3, etc):

 Total instructions in shared programs: 125014 -> 124974 (-0.03%)
 instructions in affected programs: 7544 -> 7504 (-0.53%)
 helped: 7
 HURT: 4

 total uniforms in shared programs: 19026 -> 19019 (-0.04%)
 uniforms in affected programs: 514 -> 507 (-1.36%)
 helped: 5
 HURT: 0

 total max-temps in shared programs: 13430 -> 13438 (0.06%)
 max-temps in affected programs: 270 -> 278 (2.96%)
 helped: 0
 HURT: 8

 total sfu-stalls in shared programs: 739 -> 741 (0.27%)
 sfu-stalls in affected programs: 30 -> 32 (6.67%)
 helped: 0
 HURT: 2

 total inst-and-stalls in shared programs: 125753 -> 125715 (-0.03%)
 inst-and-stalls in affected programs: 7685 -> 7647 (-0.49%)
 helped: 7
 HURT: 4

 total nops in shared programs: 8228 -> 8203 (-0.30%)
 nops in affected programs: 546 -> 521 (-4.58%)
 helped: 9
 HURT: 2

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16986>
2022-06-14 13:12:46 +00:00
Iago Toral Quiroga
4a7446e4e4 v3dv: handle barriers at the end of a command buffer
Since we only consume barriers at the beginning of a new job, if
a command buffer ends with a barrier we will not handle it. Fix
this by emitting a noop job  in that case to consume it. Ideally,
we could do better and check the pending barrier state to fine
tune the noop job so we don't wait on all queues, but for now
this fixes flakyness with some CTS pipeline barrier tests that
started to show up after we optimized binning sync barriers. It
is likely that the additional sync we had before that change was
enough to prevent the problem from showing up.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17020>
2022-06-14 11:30:33 +00:00
Iago Toral Quiroga
d6702b99a2 v3dv: merge pending secondary barrier state into primary command buffers
When we switched to using structs to track barrier state we made a mistake
and started to overwrite barrier state in primary command buffers with
the pending state from secondary command buffers executed inside them, when we
should've been merging the state instead.

Fixes flakyness with some CTS barrier tests.

Fixes: f7ce42636c ('v3dv: use an explicit struct type to track barrier state')
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17020>
2022-06-14 11:30:33 +00:00
Iago Toral Quiroga
a97f78eb14 broadcom/compiler: disable flags optimization for loop conditions
This is not safe because it may skip regenerating the flags for the
loop condition in the loop continue block and these flags may be
stomped in the loop body by other conditionals.

Fixes: 9909fe6ba ('broadcom/compiler: Skip bool_to_cond where possible')
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17020>
2022-06-14 11:30:33 +00:00
Jason Ekstrand
3ed70d775c v3dv: Use the common AcquireNextImage implementation
The only reason for the wrapper was so that we could dummy signal the
semaphore and fence.  Now that the WSI code always dos this for us, we
can drop our wrapper.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4037>
2022-06-10 01:33:12 +00:00
Juan A. Suarez Romero
8f3c60a93d v3d/ci: Add traces
Add a job to run and test traces from Tracies DB.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16809>
2022-06-06 15:18:50 +00:00
Erik Faye-Lund
873ec432b3 broadcom/compiler: use macro for power-of-two check
This will allow the use of static_assert here instead of our
compiler-specific implementation.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16670>
2022-06-03 07:14:43 +00:00
Iago Toral Quiroga
18985e8030 v3dv: use the global RCL EZ disable if we don't have any EZ draws in the job
Until now we would only disable EZ globally if we had a depth or stencil
load operation or if we had no draw calls at all, but even if we have draw
calls if all of them disable EZ we should also us the global disable.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16794>
2022-06-01 08:11:04 +02:00
Iago Toral Quiroga
0f65838933 v3dv: don't be too aggressive disabling early Z
When we have a draw call that is incompatible with EZ we should only
disable EZ for the remaining of the job in the case that both of the
following conditions are met:

1. The cause for the incompatibility is an incompatible depth test
   direction.

2. The pipeline does Z writes.

Otherwise it is enough to disable EZ temporarily only for draw calls
with the incompatible pipeline.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16794>
2022-06-01 08:10:57 +02:00
Juan A. Suarez Romero
4357dff4e9 v3d: fix blending for mixed RT formats
Blending configuration needs to be adapted in case the RT format does
not have an alpha channel. This is handled so far correctly.

But when we have two RT, one with alpha and other without it, we need
to split the blend configuration, so one is adapted and the other not.

Otherwise we would be changing the blend config for the wrong RT.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16747>
2022-05-31 17:25:50 +00:00
Juan A. Suarez Romero
836ce97f5e ci: bump VK-GL-CTS to 1.3.2.0
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Acked-by: Alejandro Piñeiro <apinheiro@igalia.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16689>
2022-05-31 15:02:08 +00:00
Iago Toral Quiroga
0ce346368f v3dv: limit sync for barriers to hw queues selected by source mask
Until know when we consumed a barrier we would implement it by
setting the serialize flag on a job, which would cause it to
be serialized across all hardware queues (CL, CSD, TFU). However,
now that we track the source(s) of the barrier, we can restrict this
to only the relevant queue(s) instead (multisync path only).

It should be noted that we can implement transfers via TFU or CL
jobs, so if the source of a barrier is a transfer, we currently
synchronize against both the TFU and the CL queues, however, we
may be able to more effectively track this in the future to
restrict this to just one of the queues.

Also, for secondary command buffers we are taking the easy way
out and always synchronize against all queues, but we should
be able to do the same for secondaries without too much effort.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16743>
2022-05-31 06:06:10 +00:00
Iago Toral Quiroga
ad249e9020 v3dv: track sources of barriers
Until now we have been tracking the dstStageMask of barriers (where they
are consumed) but not where they are produced (the srcStageMask). With
this change we extend our barrier state to keep track of this as well.

This allows the driver to have better knowledge of the intended barrier
semantics so it can limit the amount of synchronization it does only
to the source stages involved with a barrier. We will do this in a
later patch.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16743>
2022-05-31 06:06:10 +00:00
Iago Toral Quiroga
f7ce42636c v3dv: use an explicit struct type to track barrier state
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16743>
2022-05-31 06:06:10 +00:00
Iago Toral Quiroga
eccc0e6a0b v3dv: only clear BCL barrier state if we don't have pending graphics barriers
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16743>
2022-05-31 06:06:10 +00:00
Iago Toral Quiroga
24ebcbbaa7 v3dv: consume barriers at the right stages
Until now, we have always consumed barriers with the next GPU job
recorded into the command buffer after the barrier even if the job
was not the target of the barrier itself. This works based on the
idea that when we consume a barrier in a job we serialize it against
all queues, so effectively we are ensuring that whatever came
before it has completed, so if the barrier was intended for an
even later job, it would have served its purpose anyway.

It should be noted that CL jobs are special because they are actually
split in two different queues: the binning queue and the render
queue, with a dependency between them to ensure render runs after
binning. With our current implementation, if we have 3 jobs (A, B,
C) and we have a barrier after job A which is intended to block job C
on A's completion, with our implementation we would instead block
B on A's completion. If C is a CL job, and the barrier was targetting
the binning stage then we can have the following scenarios:

1. If B) is a CL job, it will consume the barrier at its binning
   stage, so we know that B's binning will not start until A has
   completed. Then C's binning will not start until B's binning
   has completed, and thus, will not start until A has completed,
   as intended.

2. If B) is not a CL job, it will consume the barrier and will not
   start until A has completed, however, C's binning job will be
   submitted to the binning queue without any sync requirements
   and since B did not put any jobs in the binning queue it will
   start as soon as A's binning has completed, but not A's render,
   which would be incorrect.

Further, since a981ac0539 we now skip consumming BCL barriers if
a job does not have draw calls that can be affected by them. In the
same scenarios as before, now case 1) would also be problematic,
since B may skip the binning sync in that case and start immediately,
and since C's binning would be allowe to start immediately after B's
binning, there is no guarantee that this doesn't happen in parallel
with A's render.

With this patch we fix this situation by tracking the intended
consumer of each barrier: graphics, compute or transfer, and we make
sure to consume them only with jobs that match those semantics.

This fixes flakyness in dEQP-VK.device_group.*

Fixes: a981ac0539 ('v3dv: skip binning sync if binning shaders don't access external resources')
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16743>
2022-05-31 06:06:10 +00:00
Alejandro Piñeiro
746287d221 v3dv/format: Add support for VK_KHR_format_feature_flags2
VK_KHR_format_feature_flags2 is mostly about define a new 64-bit
VkFormatFeatureFlagBits2KHR format feature flag type, as 29 bits of
the 32-bit VkFormatFeatureFlagBits are already in use.

So all the bits from VkFormatFeatureFlagBits are being replicated, and
most of the work here consist on switch to the new flags.

From the new (not replicated from VkFormatFeatureFlagBits) flag bits,
we don't support
VK_FORMAT_FEATURE_2_STORAGE_READ_WITHOUT_FORMAT_BIT_KHR or
VK_FORMAT_FEATURE_2_STORAGE_WRITE_WITHOUT_FORMAT_BIT_KHR, as right now
we require the format on the shader for doing the read and stores.

We use now VK_FORMAT_FEATURE_2_SAMPLED_IMAGE_DEPTH_COMPARISON_BIT_KHR,
but only applying it for depth formats.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16718>
2022-05-26 21:20:50 +00:00
Alejandro Piñeiro
11a0ea76a2 v3dv/format: no need for GetPhysicalDeviceFormatProperties
The common Mesa Vulkan framework already provides a common
implementation based on GetPhysicalDeviceFormatProperties2.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16718>
2022-05-26 21:20:50 +00:00