I thought I removed those, it seems my rebase got screwed up :(
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 64f20cec28 ("anv: prepare image/buffer views for non indirect descriptors")
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23317>
The GS and the FS needs to agree on the driver_location. But we just
used the num_outputs variable for the GS instead of matching the logic
from lower_aaline_instr in nir_draw_helpers.c.
This does that, but cleans up our copy slightly to avoid computing the
needless location, as well as using unsigned values.
This used to *mostly* work before, but only because we were lucky and
not too much crazy stuff went on with the inputs / outputs in
smooth-line cases.
Fixes: edecb66b01 ("nir: avoid generating conflicting output variables")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23316>
Since 2d6233d0 ("nir: Check all sizes in nir_alu_instr_is_comparison"),
nir_alu_instr_is_comparison already returns true for comparisons with 32bit
result.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23287>
For:
v_mov_b32_e32 v0, 1.0
exp mrtz v0, off, off, off
we should completely remove the ALU entry before creating the EXP's WaR entry for v0.
Otherwise, the two will be combined into an entry which will wait for
expcnt(0) for later uses of v0.
gen_alu() should also be before gen(), since gen_alu() performs the clear
while gen() creates the WaR entry.
fossil-db (gfx1100):
Totals from 3589 (2.69% of 133428) affected shaders:
Instrs: 5591041 -> 5589047 (-0.04%); split: -0.04%, +0.00%
CodeSize: 28580840 -> 28572864 (-0.03%); split: -0.03%, +0.00%
Latency: 65427923 -> 65427543 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 11109079 -> 11109065 (-0.00%); split: -0.00%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23213>
This does not fix the tests completely, but does allow them to run to
completion and fail "properly".
Also contains a few trivial bugfixes in the same codepath.
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Reported-by: James Glanville <james.glanville@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23100>
This commit adds handling for {s,z}loaden and {s,z}storeen to
control loading from and storing to the stencil and depth buffer.
This commit also addressed the FIXMEs around barrier_{load,store}
which control the {s,z}{load,store}en.
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20487>
This allows sub_cmd->job.run_frag to be setup before calling
pvr_sub_cmd_gfx_requires_split_submit().
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Reported-by: James Glanville <james.glanville@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23125>
Deferred pvr_transfer_cmd instances are allocated from a dyn_array on
the owning pvr_cmd_buffer and must not be freed directly.
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Reported-by: James Glanville <james.glanville@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23125>
These previously COMMAND scoped allocations are stored on the
VkRenderPass and must therefore be OBJECT scoped.
Fixes: dEQP-VK.api.object_management.single_alloc_callbacks.render_pass
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23130>
A recent update to Cyberpunk 2077 enables XeSS code for Intel GPUs
which is causing the game to crash in the XeSS libraries. As a
temporary work around, stop identifying as Intel for Cyberpunk so
XeSS falls back to the cross-vendor path.
References: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8860
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23271>
Not sure we bumped it to 32 for the right reasons. This generates more
push constant data and because we're not tighly packing our push
constant data this can generate more register pressure.
We could tightly pack things at the cost of some CPU cycles but only
for some stages. RT stages would have to retain the current "sparse"
version of push constants.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>
descriptor_stride includes multiple plane size, this new field tracks
just the data of one plane.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>
To make BTI indexing simpler with ycbcr samplers, stop doing packing
calculations in the apply_layout. We'll insert NULL bindings for the
few ycbcr cases where it's needed.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>
When in direct descriptor mode, the descriptor pool buffers will hold
surface states directly. We won't allocate surface states in image &
buffer views.
Instead views will hold a packed RENDER_SURFACE_STATE ready to copied
into the descriptor buffers.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>
Now that descriptor sets are located a in a 1Gb area, we can avoid
storing the whole address to the descriptor and add the base address
of the area to a 32bit offset.
Replay a bunch of fossils with this and changes not really significant
one way or another :
Totals:
Instrs: 9278246 -> 9277148 (-0.01%); split: -0.01%, +0.00%
Cycles: 3547598421 -> 3547579435 (-0.00%); split: -0.00%, +0.00%
Totals from 353 (1.14% of 31021) affected shaders:
Instrs: 581546 -> 580448 (-0.19%); split: -0.23%, +0.04%
Cycles: 25885422 -> 25866436 (-0.07%); split: -0.31%, +0.24%
No difference on send messages or spills/fills.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>
We'll use the fact that the pool is aligned to 4Gb to limit the amount
of address computations to build the address in the shaders.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>
This is the default for now. It needs to be part the pipeline hashing
as we will allow this to be tweaked per application.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>