We precompile static state and count it as dynamic, so we have to
manually clear bitset that tells which dynamic state is set, in order to
make sure that future dynamic state will be emitted. The issue is that
framework remembers only a past REAL dynamic state and compares a new
dynamic state against it, and not against our static state masquaraded
as dynamic.
Example:
- Set dynamic state S with value A
- Bind pipeline with dynamic state S
- Draw
- Bind pipeline with static state S with value B
- Draw
- Set dynamic state S with value A
- Bind pipeline with dynamic state S
- Draw
Previously, at the last draw the dynamic state S was not dirty and
current dynamic state was equal to the past dynamic state, so
it was not emitted, while GPU used value B from static pipeline.
This fix, at the point of static pipeline binding, clears the
bitset which tells that dynamic state S was previously set.
This forces the next dynamic state to be re-emitted.
Fixes broken rendering in Arma 3, and probably some other
games running through DXVK.
Fixes: 97da0a7734
("tu: Rewrite to use common Vulkan dynamic state")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27961>
To avoid ping-ponging between 3D & GPGPU in the following sequence :
vkCmdDispatch(...)
vkCmdCopyBuffer(...)
vkCmdDispatch(...)
We can try to keep the pipeline in GPGPU mode when doing blorp buffer
operations (we have blorp support for the CCS and can use the same
shaders on RCS).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27956>
There are multiple problems currently :
- blorp blitter commands overwrite the protection value coming from
the driver
- anv & iris are using render target MOCS for compute commands
Driver already have the ability to pass the MOCS values so we choose
to stick to that in this change. But now the driver need to select the
right MOCS depending on the engine the commands are going to run onto.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27956>
We need to use the usage parameter.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 64f20cec28 ("anv: prepare image/buffer views for non indirect descriptors")
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27956>
An increase by factor 10 with each re-allocation is a bit aggressive and
we hit the available limit easily on lavapipe.
By starting of with an initial larger scale, but decreasing this over time
this error can be avoided.
Specifically with
"spec@arb_shader_texture_lod@execution@tex-miplevel-selection *gradarb 1d"
originally the buffer sizes would be 250, 2500, 25000, and 250000,
with the patch it's 250, 4000, and 32000.
v2: use minimum scale of 4 instead of 2 (Mike)
v3: fix typo (Mike)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27977>
ZINK_BIND_RESOURCE_DESCRIPTOR and ZINK_BIND_SAMPLER_DESCRIPTOR are
always used together, so that we can replace these two values with
ZINK_BIND_DESCRIPTOR and use only one bit to represent the value.
With that we can also remove the aliasing of ZINK_BIND_DESCRIPTOR with
PIPE_BIND_CONST_BW.
Fixes: 13c6ad0038
zink: use a single descriptor buffer for all non-bindless types
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28016>
These don't seem useful, since they're already done in the early optimizations.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27335>
We allocate them at the end of the register file and keep them separate
from normal VGPRs. This is for two reasons:
- Because we only ever move linear VGPRs into an empty space or a space
previously occupied by a linear one, we never have to swap a normal VGPR
and a linear one. This simplifies copy lowering.
- As linear VGPR's live ranges only start and end on top-level blocks, we
never have to move a linear VGPR in control flow.
fossil-db (navi31):
Totals from 5493 (6.93% of 79242) affected shaders:
MaxWaves: 150365 -> 150343 (-0.01%)
Instrs: 7974740 -> 7976073 (+0.02%); split: -0.06%, +0.08%
CodeSize: 41296024 -> 41299024 (+0.01%); split: -0.06%, +0.06%
VGPRs: 283192 -> 329560 (+16.37%)
Latency: 64267936 -> 64268414 (+0.00%); split: -0.17%, +0.17%
InvThroughput: 10954037 -> 10951735 (-0.02%); split: -0.09%, +0.07%
VClause: 132792 -> 132956 (+0.12%); split: -0.06%, +0.18%
SClause: 223854 -> 223841 (-0.01%); split: -0.01%, +0.01%
Copies: 559574 -> 561395 (+0.33%); split: -0.24%, +0.56%
Branches: 179630 -> 179636 (+0.00%); split: -0.02%, +0.02%
VALU: 4572683 -> 4574487 (+0.04%); split: -0.03%, +0.07%
SALU: 772076 -> 772111 (+0.00%); split: -0.01%, +0.01%
VOPD: 1095 -> 1099 (+0.37%); split: +0.73%, -0.37%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27697>
This is almost a direct copy+paste into it's own function.
This is useful both for future work and the make the caller smaller.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27697>
This is already the case, and requiring it will be useful in the future.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27697>
Currently the command buffer is completely empty, which is not good.
There are a few of things that should be programmed, but we've
probably been okay due to the default engine initialization.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: edcde0679c ("anv: Add helper to create companion RCS command buffer")
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27979>
The R400 has working hardware opcodes for cos and sin at
the vertex shader level. This change enables these features.
This change was tested on an ATI R430 (0x554d).
Here is the shader-db summary:
total instructions in shared programs: 103863 -> 103552 (-0.30%)
instructions in affected programs: 5610 -> 5299 (-5.54%)
helped: 38
HURT: 24
total temps in shared programs: 16836 -> 16830 (-0.04%)
temps in affected programs: 42 -> 36 (-14.29%)
helped: 6
HURT: 0
total cycles in shared programs: 162448 -> 162139 (-0.19%)
cycles in affected programs: 5760 -> 5451 (-5.36%)
helped: 38
HURT: 24
LOST: 0
GAINED: 3
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10504
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27970>
In order to remove the ambiguity because for task shaders the driver
needs to emit to both the GFX CS and the ACE CS but all states come
from the main cmdbuf (ie. GFX) from the application point of view.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27819>
This workaround will be removed soon but in order to pass only
radv_cmd_state+cs+ace_cs to the functions that draw with mesh+task, the
32-bit value needs to be allocated earlier.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27819>