Commit graph

1479 commits

Author SHA1 Message Date
Iago Toral Quiroga
370495abd1 v3d: disable GLSL loop unrolling again
We had re-enabled this because of some test regressions:

KHR-GLES31.core.geometry_shader.limits.max_input_components and
ext_transform_feedback-max-varyings failed to register allocate,
but now that we support indirect indexing on vertex shader outputs natively
this is no longer an issue.

Piglit's max-samplers tests failed. These tests use indirect indexing
on samplers which is not supported and fail to link with this error message:
"Failed to link: error: sampler arrays indexed with non-constant expressions
is forbidden in GLSL  110". This is expected. The reason these were passing
before is that loop unrolling was able to turn indirect indexing into
direct indexing. We add them to the expected fail list.

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10723>
2021-05-11 09:31:31 +00:00
Iago Toral Quiroga
f0fef41917 broadcom/compiler: don't unroll due to indirect indexing of outputs
We can handle this natively now, so there is no point.

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10723>
2021-05-11 09:31:31 +00:00
Iago Toral Quiroga
9f5481cf78 v3dv: don't lower indirect derefs on output variables
Our backend compiler can handle this for all supported shader stages now.

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10723>
2021-05-11 09:31:31 +00:00
Iago Toral Quiroga
0235ed18a7 broadcom/compiler: don't use nir_src_is_dynamically_uniform
Now that we have divergence analysis we should use that.

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10723>
2021-05-11 09:31:31 +00:00
Iago Toral Quiroga
cb39dca2d3 broadcom/compiler: make vir_VPM_WRITE_indirect handle non-uniform offsets
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10723>
2021-05-11 09:31:31 +00:00
Iago Toral Quiroga
f71893a942 broadcom/compiler: implement non-uniform offset on vertex outputs
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10723>
2021-05-11 09:31:31 +00:00
Iago Toral Quiroga
067ad7eccc broadcom/compiler: move vertex shader output handling to its own function
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10723>
2021-05-11 09:31:31 +00:00
Juan A. Suarez Romero
54ec9c95cf broadcom/compiler: fix dynamic-stack-buffer-overflow error
When spilling a register, the number of temps can be increased when
introducing a temporal variable.

Those nodes are not elegible to be spilled, but we need to take care of
no accessing out-of-bounds of the arrays defined with a size equal to
the original number of temps.

Fixes address sanitizer error on
KHR-GLES3.shaders.uniform_block.random.all_shared_buffer.14 (and many
others).

v2 (Iago):
 - Add clarification in assertion.
 - Use `vir_get_temp` to increase num_temps.

v3 (Iago):
 - Update clarification

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10643>
2021-05-11 07:46:17 +00:00
Juan A. Suarez Romero
ea463f9bff ci/broadcom: update expected results
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10694>
2021-05-10 10:11:09 +00:00
Iago Toral Quiroga
d81a6e5f1d broadcom/compiler: change register allocation policy for accumulators
The current policy is to always favor accumulators if possible, however,
this is not always optimal.

Particularly, accumulators play a crucial role in enabling QPU instruction
merges, since these are limited to both the ADD and the ALU instructions
addressing at most 2 physical registers. For 2-src instructions, this means
that to be able to merge we need them to address at least 2 accumulators.

While favoring accumulators does help the case for instruction merges in
general, it is risky to assign accumulators to variables that have
long life spans. Doing so will make the accumulator unavailable for
any other instructions during that life span, and since we only have a few
accumulators, we can quickly run out and losing our capacity to merge
instructions for large parts of the qpu program.

On the other hand, we also want to avoid the extreme case were we keep
allocating physical registers to the point we run out, even if we have
accumulators available, since accumulators have additional restrictions
and may not be suitable for everything.

This change continues the policy of favoring accumulators, but it only
does so if the life span of the temps is short, to ensure that we can
recycle accumulators often across instructions and avoid running out
for sections of the QPU code, unless we are already running out of
physical registers.

total instructions in shared programs: 13654647 -> 13336921 (-2.33%)
instructions in affected programs: 11015919 -> 10698193 (-2.88%)
helped: 39758
HURT: 17325
Instructions are helped.

total threads in shared programs: 412046 -> 412038 (<.01%)
threads in affected programs: 16 -> 8 (-50.00%)
helped: 0
HURT: 4
Threads are HURT.

total uniforms in shared programs: 3745726 -> 3746003 (<.01%)
uniforms in affected programs: 17296 -> 17573 (1.60%)
helped: 76
HURT: 99
Uniforms are HURT.

total max-temps in shared programs: 2364430 -> 2359942 (-0.19%)
max-temps in affected programs: 109117 -> 104629 (-4.11%)
helped: 2893
HURT: 772
Max-temps are helped.

total spills in shared programs: 5727 -> 5746 (0.33%)
spills in affected programs: 221 -> 240 (8.60%)
helped: 1
HURT: 2

total fills in shared programs: 13121 -> 13139 (0.14%)
fills in affected programs: 466 -> 484 (3.86%)
helped: 1
HURT: 2

total sfu-stalls in shared programs: 33432 -> 34491 (3.17%)
sfu-stalls in affected programs: 18219 -> 19278 (5.81%)
helped: 4459
HURT: 5087
Inconclusive result

total inst-and-stalls in shared programs: 13688079 -> 13371412 (-2.31%)
inst-and-stalls in affected programs: 11030017 -> 10713350 (-2.87%)
helped: 39630
HURT: 17429
Inst-and-stalls are helped.

total nops in shared programs: 335753 -> 333708 (-0.61%)
nops in affected programs: 112659 -> 110614 (-1.82%)
helped: 8726
HURT: 7383
Inconclusive result

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10686>
2021-05-08 13:15:42 +02:00
Iago Toral Quiroga
c11e479852 broadcom/compiler: specify maximum thread count in compile strategies
Once we have exhausted compile strategies at 4 threads and we start
enabling lower thread counts, there is no point in starting compiles
with 4 threads for them, we know these will fail, so let's start at
2 in these cases.

This also has another nice implication: if the driver compiles at 4
threads and fails to register allocate, we were allowing it to try
with 2 threads, but this would only retry the register allocation
process and would not really recompile the shader with 2 threads. This
is not optimal, because at 2 threads we have more TMU fifo space for
each thread and we can do more TMU pipelining, so we were missing that
opportunity.

This improves performance in Sponza by ~1.5% and also seems to help
UE4 slightly.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10647>
2021-05-06 12:27:06 +02:00
Iago Toral Quiroga
d19ce36ff2 broadcom/compiler: refactor compile strategies
Until now, if we can't compile at 4 threads we would lower thread count
with optimizations disabled, however, lowering thread count doubles the
amount of registers available per thread, so that alone is already a big
relief for register pressure so it makes sense to enable optimizations
when we do that, and progressively disable them until we enable spilling
as a last resort.

This can slightly improve performance for some applications. Sponza,
for example, gets a ~1.5% boost. I see several UE4 shaders that also get
compiled to better code at 2 threads with this, but it is more difficult
to assess how much this improves performance in practice due to the large
variance in frame times that we observe with UE4 demos.

Also, if a compiler strategy disables an optimization that did not make
any progress in the previous compile attempt, we would end up re-compiling
the exact same shader code and failing again. This, patch keeps track of
which strategies won't make progress and skips them in that case to save
some CPU time during shader compiles.

Care should be taken to ensure that we try to compile with the default
NIR scheduler at minimum thread count at least once though, so a specific
strategy for this is added, to prevent the scenario where no optimizations
are used and we skip directly to the fallback scheduler if the default
strategy fails at 4 threads.

Similarly, we now also explicitly specify which strategies are allowed to do
TMU spills and make sure we take this into account when deciding to skip
strategies. This prevents the case where no optimizations are used in a
shader and we skip directly to the fallback scheduler after failing
compilation at 2 threads with the default NIR scheduler but without trying
to spill first.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10647>
2021-05-06 12:27:06 +02:00
Iago Toral Quiroga
296fe4daa6 broadcom/compiler: add a compiler strategy to disable loop unrolling
Loop unrolling can increase register pressure significantly, leading to
lower thread counts and spilling.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10647>
2021-05-06 12:27:06 +02:00
Iago Toral Quiroga
4742300e6b v3d: move NIR compiler options to GL driver
The Vulkan driver was already creating and using its own set of options, so
the ones defined in the compiler are only used with GL, which is confusing.
Move them to the GL driver.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10647>
2021-05-06 12:27:06 +02:00
Iago Toral Quiroga
db3fa1cc8c v3dv: setup loop unrolling
We set the maximum at 16 iterations (the GL compiler chooses 32
iterations for the GLSL front-end loop unrolling pass) because we
have observed a bunch of shaders from Sascha Willems that spill
significantly with 32, leading to massive performance degradation,
while 16 avoids spilling and doesn't seem to cause visible
performance degradation compared to cases that unroll 32 without
spilling.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10647>
2021-05-06 12:25:22 +02:00
Iago Toral Quiroga
ec72b876fe broadcom/compiler: add a loop unrolling pass
Right now this is useful for Vulkan onnly, because GL gets loop
unrolling from the GLSL compiler and/or mesa state tracker
NIR front-ends.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10647>
2021-05-06 12:24:29 +02:00
Iago Toral Quiroga
f099fc3e07 v3d: choose a larger CSD supergroup size if possible
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10541>
2021-05-04 15:53:23 +00:00
Iago Toral Quiroga
3ce249e65e broadcom/common: move CSD supergroup sizing to a common helper
We want to use this in GL too.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10541>
2021-05-04 15:53:23 +00:00
Iago Toral Quiroga
afc33a7430 v3dv: limit supergroup size in presence of TSY barriers
When a TSY barrier is hit, the entire supergroup will be synchronized.
If the supergoup is large and uses all available QPU threads it would
mean that we would sychronize and stall all running threads until all
of them reach the barrier, which may be inefficient.

This patch makes it so that if the compute shader has any such barriers
we limit the supergroup size so each supergroup only takes half of the
QPU threads available at most, so that if one supergroup hits a
barrier we have at least one other supergroup we can run, reducing
idle QPU time.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10541>
2021-05-04 15:53:23 +00:00
Iago Toral Quiroga
f514280524 broadcom/compiler: track if a shader has control barriers in prog_data
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10541>
2021-05-04 15:53:23 +00:00
Iago Toral Quiroga
2e0f6e5705 v3dv: choose a larger CSD supergroup size if possible
Each supergroup executes a number batches. Each batch has 16 elements
(one per QPU lane), except possibly the last batch which might be
incomplete. Until now, we packed a single workgroup in each supergroup,
which can lead to more incomplete batches and less efficient use
of the QPUs depending on the configuration of workgroups being dispatched.

This patch computes a number of workgroups per supergroup so that
we reduce or completely eliminate incomplete batches if possible.

It should be noted however, that TSY barriers act on supergroups,
so larger supergroups lead to larger syncpoints on barriers too.
A follow-up patch will try to find a good balance for compute shaders
that use such barriers.

This improves performance of the Sascha Willem's computecloth demo
by ~13%.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10541>
2021-05-04 15:53:23 +00:00
Jose Maria Casanova Crespo
ab1d66a111 ci/v3d: Update piglit expectations.
As piglit job is manual, I forgot to update three new test passing at
spec@ext_image_dma_buf_import subgroup after merging
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10524

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10536>
2021-04-30 10:22:53 +02:00
Juan A. Suarez Romero
33f9b06b0e v3dv: check dest bitsize in color blit
Otherwise, if src_bit_size > 0 and dst_bit_size == 0, we end up doing a
bad shift in `1 << (dst_bit_size - 1)`, as `dst_bit_size - 1` is a
negative value (in this case would be MAX_UINT32).

Fixes CID#1468134 "Bad bit shift operation (BAD_SHIFT)":
   "large_shift: In expression 1 << dst_bit_size - 1U, left shifting by
    more than 31 bits has undefined behavior. The shift amount,
    dst_bit_size - 1U, is 4294967295."

v2:
 - Use an assertion instead (Iago)

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10251>
2021-04-29 10:31:11 +00:00
Juan A. Suarez Romero
fd8d71ce41 v3dv: rename VC5 to V3D
As we are not using anymore references to the old VC5, let's rename
definitions from VC5 to V3D in the Vulkan driver.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10402>
2021-04-29 11:22:12 +02:00
Juan A. Suarez Romero
26618dfb87 broadcom/simulator: change references to VC5
We are referring the driver as V3D instead old VC5; so let's update the
references.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10402>
2021-04-29 11:22:12 +02:00
Juan A. Suarez Romero
a77002584d broadcom/qpu: rename from VC5 to V3D
Get rid of old references to VC5.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10402>
2021-04-29 11:22:12 +02:00
Juan A. Suarez Romero
dfe50e02c9 ci/broadcom: update expected results
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10501>
2021-04-28 15:21:28 +00:00
Alejandro Piñeiro
79e4451430 v3dv: move extensions table to v3dv_device
So one less python generator. Based on anv (MR#8792) and radv
(MR#8900).

With this change v3dv doesn't have any more a custom python code
generator.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10484>
2021-04-28 09:13:55 +00:00
Alejandro Piñeiro
8d72992ed5 v3dv: remove custom icd json generation
Most of the stuff needed was moved to vk util. So one less python
generator to maintain.

anv and radv already migrated.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10484>
2021-04-28 09:13:55 +00:00
Juan A. Suarez Romero
2da01e6123 ci/v3dv: update flakes
Add
dEQP-VK.synchronization.op.single_queue.binary_semaphore.write_copy_buffer_read_ssbo_compute.buffer_16384
to the flake list.

v2:
 - Add a new flake (jasuarez)

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10479>
2021-04-28 07:38:53 +00:00
Juan A. Suarez Romero
9be055a22a ci/v3d: fix typo in job name
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10479>
2021-04-28 07:38:53 +00:00
Iago Toral Quiroga
d636c5660c v3dv: implement wsi hook to decide if we can present directly on device
This will prevent the driver to take the prime blit path for presentation
in scenarios where it can avoid it, which can substantially improve
performance, particularly at high resolutions.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5917>
2021-04-27 06:37:43 +00:00
Juan A. Suarez Romero
c93bd731f8 v3dv/pipeline_cache: bail out in case of error
Currently, in GetPipelineCacheData() function, in several cases if
there is an error the blob is finished and cache unlocked, but code
continues executing, which can lead to multiple
`pthread_mutex_unlock()` calls.

Instead, if there's an error just bail out to finish the blob and unlock
the cache directly.

Fixes CID#1468147 "Double unlock (LOCK)".

v2:
 - Rename "bail_out" by "done" (apinheiro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10404>
2021-04-22 11:40:40 +02:00
Juan A. Suarez Romero
84c64a0713 ci/v3d: execute all piglit tests
Most of the regressions we found are with the piglit testsuite. The
difference between executing all tests versus quick_gl + quick_shader
are minimal, in the sense that we would need the same number of jobs to
execute and be in the 10 minutes budget.

Hence replace "quick_gl" + "quick_shader" by "all".

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10378>
2021-04-22 07:53:39 +00:00
Juan A. Suarez Romero
796cb1e9d5 v3dv: check returned values
Check if v3dv_ioctl() or v3dv_bo_map() fail, and print a proper error
message.

This check happens in the rest of the code, so it makes sense to apply
here too.

Fixes CID#1468162 "Unchecked return value (CHECKED_RETURN)".

v2:
 - Fix message error (Iago)

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10380>
2021-04-22 07:39:24 +00:00
Juan A. Suarez Romero
ccfe3e4af5 ci/broadcom: add EGL testing jobs
Reviewed-by: Eric Anholt <eric@anholt.net
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10352>
2021-04-21 07:33:35 +00:00
Juan A. Suarez Romero
37e7725a5e ci/vc4: add KHR-GLES2.* job test
Reviewed-by: Eric Anholt <eric@anholt.net
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10352>
2021-04-21 07:33:35 +00:00
Juan A. Suarez Romero
79a0eee2fb ci/broadcom: update expected results
Reviewed-by: Eric Anholt <eric@anholt.net
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10352>
2021-04-21 07:33:35 +00:00
Anuj Phogat
12099d51f6 intel: Rename gen_10 to ver_10
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>
2021-04-20 20:06:34 +00:00
Juan A. Suarez Romero
0dde87457e ci/v3d: add KHR-GLES test jobs
Add (manually launched) jobs for KHR-GLES2.*, KHR-GLES3.* and
KHR-GLES31.*

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10287>
2021-04-20 12:39:27 +02:00
Alejandro Piñeiro
f5133f6bce v3dv/pipeline: track descriptor maps per stage, not per pipeline
One of the conclusions of our recent clean up on the limits was that
the pipeline limits needed to be the per-stage limits multiplied by
the number of stages.

But until now we only have a set of descriptor maps for the full
pipeline. That would work if we could set the same limit per pipeline
that per stage, but that is not the case. So if, for example, we have
the fragment shader using V3D_MAX_TEXTURE_SAMPLERS textures, and then
the vertex shader, with a different descriptor set, using one texture,
we would get an index greater that V3D_MAX_TEXTURE_SAMPLERS. We assert
that index as an error on the vulkan backend, but fwiw, it would be
also asserted on the compiler.

With this commit we track and allocate a descriptor map per stage,
although we reuse the vertex shader descriptor map for the vertex bin.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10272>
2021-04-19 23:10:35 +00:00
Iago Toral Quiroga
6c80b084f2 v3dv: better tracking of dirty push constant state
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10283>
2021-04-16 12:29:11 +00:00
Iago Toral Quiroga
30f125f04f v3dv: dirty viewport doesn't affect fragment shaders
The uniform state for the viewport is only used with geometry stages.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10283>
2021-04-16 12:29:11 +00:00
Iago Toral Quiroga
35ff75701f v3dv: improve dirty descriptor set state tracking
We were using the pipeline layout to discard uniform updates for
stages that don't use descriptors, but we can do better by keeping
track of the stages used by the specific dirty descriptor sets and
only update uniforms for stages that are included in those.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10283>
2021-04-16 12:29:11 +00:00
Juan A. Suarez Romero
d29b5b9f20 v3dv: avoid dereferencing null value
Fixes CID#1468079 "Dereference null return value (NULL_RETURNS)"

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10280>
2021-04-16 11:12:31 +00:00
Iago Toral Quiroga
1cf36797bf v3dv: fix sRGB blending workaround
This workaround needs to set a flag in the current job but it was
implemented at pipeline binding time, which can happen outside a
render pass. Move it to the pre-draw handler, where it belongs.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4645
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10255>
2021-04-16 06:05:59 +00:00
Adam Jackson
fc9b3b260e Revert "glx: Lift sending the MakeCurrent request to top-level code"
This provokes crashes in Cinnamon for some reason that I haven't
diagnosed yet.

This reverts commit 80b67a3b44.

Fixes: 80b67a3b44 glx: Lift sending the MakeCurrent request to top-level code
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4639
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10260>
2021-04-15 19:48:10 +00:00
Michel Dänzer
d200f45875 Use explicit break instead of fall-through to break-only case
clang generates a warning if there's no explicit break or fall-through
annotation. The latter would be kind of silly in this case, and not
robust against any future changes turning the fall-through invalid.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>
2021-04-15 16:01:22 +00:00
Iago Toral Quiroga
bed3f31fc6 v3dv: don't use a dedicated BO for each occlusion query
Dedicated BOs waste memory and are also a significant cause of CPU
overhead when applications use hundreds of them per frame due to
all the work the kernel has to do to page in all these BOs for a job.
The UE4 Vehicle demo was hitting this causing it to freeze and stutter
under 1fps.

The hardware allows us to setup groups of 16 queries in consecutive
4-byte addresses, requiring only that each group of 16 queries is
aligned to a 1024 byte boundary. With this change, we allocate all
the queries in a pool in a single BO and we assign them different
offsets based on the above restriction. This eliminates the freezes
and stutters in the Vehicle sample.

One caveat of this solution is that we can only wait or test for
completion of a query by testing if the GPU is still using its BO,
which basically means that we can only wait for all active queries
in a pool to complete and not just the ones being requested by the
API. Since the Vulkan recommendation is to use a different query
pool per frame this should not be a big issue though.

If this ever becomes a problem (for example if an application does't
follow the recommendation and instead allocates a single pool and
splits its queries between frames), we could try to group queries
in a pool into a number of BOs to try and find a balance, but for
now this should work fine in most cases.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10253>
2021-04-15 12:45:07 +00:00
Iago Toral Quiroga
917049e7d6 v3dv: fix array sizes when tracking BOs during uniform setup
The resource indices we get point to descriptor map entries that include
all shader stages, so we need to size the arrays to account for more than
just one stage.

For now we only support up to 2 stages in a pipeline, so we use that.

Fixes: 002304482c ('v3dv: avoid redundant BO job additions for UBO/SSBO')
Fixes: fa170dab4c ('v3dv: avoid redundant BO job additions for textures and samplers')
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10252>
2021-04-15 11:26:04 +00:00