Our staging resources need to be LINEAR, however we don't support LINEAR with
DEPTH_STENCIL. The APIs don't actually require this, we just need to make sure
we don't generate internal staging blits to linear depth/stencil resources. For
uploading to compressed depth/stencil textures, we could use a depth/stencil
staging (since we can read from linear depth/stencil). However, for downloading
from compressed depth/stencil, we can't use a depth/stencil staging (since we
can't write linear depth/stencil). So, to handle both cases in a unified way,
just use colour blits for depth/stencil resources, using a compatible colour
format. This wouldn't be ok for an application to do itself, but within the
driver we know that it's safe, since there's no difference in memory between
depth/stencil and colour on AGX. In particular, Z16 is compressed exactly the
same as R16, Z8 as R8, and so on.
Fixes depth/stencil compression.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>
Otherwise framebuffer->width ends up being wrong with u_blitter, this is what
other drivers do. If we needed to render to depth/stencil with u_blitter, this
would cause us trouble.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>
We go to initialize the disk cache before we've compiled any shaders so
agx_compiler_debug is 0 at this point. Don't try to read it, instead go through
sa safe getter that will do the right thing.
Fixes: 5e9538c12e ("agx: isolate compiler debug flags")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>
When an empty batch is submitted, nothing happens. However, this batch
may have taken over writer status for some BOs which still have a
pending submitted batch that hasn't finished yet. If we drop writer
status at this point, two bad things happen:
- We spuriously clear bo->writer_syncobj, which breaks syncing on
post-facto BO exports
- We break agx_sync_writer(), since we no longer know about the old
writer to properly block on it.
To fix this (hopefully rare) case, take advantage of bo->writer_syncobj
to find the currently submitted writer batch again, and revert the
writer to it. If this turns out to be common and a performance issue
iterating through submitted batches for each written BO, we could
implement it with two writer batch arrays instead, one for active
writers and one for submitted writers... but hopefully that isn't
necessary.
This splits the cleanup path in agx_batch_cleanup() depending on whether
the cleanup is for a reset or proper completion. Since this is only used
within agx_batch.c, drop the public prototype while we're at it.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>
These need the shader_base added to them. Fixes GEM_BIND errors after
usc_head provides VA without the VM_SHADER_START offset from returned
low VA.
Signed-off-by: Janne Grunau <j@jannau.net>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>
The typo is in the !__GLIBC__ case and was observed while building on
Alpine.
Fixes: 0a132b0640 ("asahi: Add a helper macro for debug/error messages")
Reported-by: mps
Tested-by: mps
Signed-off-by: Janne Grunau <j@jannau.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>
This doesn't enable support for a7xx yet, but uses the new register pack
builders for registers that differ between a7xx and a6xx.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22837>
../src/microsoft/vulkan/dzn_image.c: In function ‘dzn_GetImageMemoryRequirements2’:
../src/microsoft/vulkan/dzn_image.c:918:91: error: passing argument 6 of ‘dzn_ID3D12Device12_GetResourceAllocationInfo3’ from incompatible pointer type [-Werror=incompatible-pointer-types]
918 | &image->castable_format_count, &image->castable_formats,
| ^~~~~~~~~~~~~~~~~~~~~~~~
| |
| DXGI_FORMAT **
In file included from ../src/microsoft/vulkan/dzn_private.h:67,
from ../src/microsoft/vulkan/dzn_image.c:24:
../src/microsoft/vulkan/dzn_abi_helper.h:64:107: note: expected ‘const DXGI_FORMAT * const*’ but argument is of type ‘DXGI_FORMAT **’
64 | const UINT *num_castable_formats, const DXGI_FORMAT *const *castable_formats,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors
ninja: build stopped: subcommand failed.
Fixes: 71dbb3120a ("dzn: Use GetResourceAllocationInfo3 for castable formats")
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22877>
reason:
in some cases, bs buffer size could cause assertion,
and some bitstreams of certain resolutions could
not be decoded.
solution:
to align the bs buffer to 128.
fixes: 4f1646d73f
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22844>
We're about to manipulate these pools and dealing with the fix address
ranges is painful.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22847>
We want to add more heaps in the future and so not having to do
address checks to find out in what heap to release a BO is convinient.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22847>
All the supported platforms should have 36+ bits of virtual address
space.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22847>
These asserts were checking isl_format_layout against itself, change
to compare surface format layout against view format layout.
Fixes: 628bfaf1c6 ("intel/isl: Add some sanity checks for compressed surfaces")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22790>
With the following test :
dEQP-VK.spirv_assembly.instruction.terminate_invocation.terminate.no_out_of_bounds_load
There is a :
shader_start:
... <- no control flow
g0 = some_alu
g1 = fbl
g2 = broadcast g3, g1
g4 = get_buffer_size g2
... <- no control flow
halt <- on some lanes
g5 = send <surface>, g4
eliminate_find_live_channel will remove the fbl/broadcast because it
assumes lane0 is active at get_buffer_size :
shader_start:
... <- no control flow
g0 = some_alu
g4 = get_buffer_size g0
... <- no control flow
halt <- on some lanes
g5 = send <surface>, g4
But then the instruction scheduler will move the get_buffer_size after
the halt :
shader_start:
... <- no control flow
halt <- on some lanes
g0 = some_alu
g4 = get_buffer_size g0
g5 = send <surface>, g4
get_buffer_size pulls the surface index from lane0 in g0 which could
have been turned off by the halt and we end up accessing an invalid
surface handle.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20765>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22690>
There is no need to use alloc_vertices_and_primitives anymore,
because it will be compiled to sendmsg anyway.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22690>
Legacy GS needs to emit a DONE signal at the end. Do this in NIR
instead of in the backends.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22690>
Remove the GS intrinsics completely and emit the sendmsg here
instead of in the backend. This is done to simplify backend code.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22690>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22690>
Will be used by ac/nir legacy and NGG lowerings.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22690>
Contains a global wave ID of legacy GS waves.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22690>
This intrinsic is going to be used for simplifying GS code.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22690>
v2 by Timur Kristóf:
Do not add the affinity for instructions that can't write m0
reliably, such as readlane-like instructions on GFX8.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22690>
This value takes a few instructions to create, involving expanding
V-immediates, adding 8 for SIMD16, and so on. We can mark it UNDEF
so that it's clear that although these are partial writes, we are
actually defining the entire value.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22835>
Comparisons which produce 32-bit boolean results (0 or 0xFFFFFFFF)
but operate on 16-bit types would first generate a CMP instruction
with W or HF types, before expanding it out. This CMP is a partial
write, which leads us to think the register may contain some prior
contents still. When placed in a loop, this causes its live range
to extend beyond its real life time.
Mark the register with UNDEF first so that we know that no prior
contents exist and need to be preserved.
This affects:
flt32, fge32, feq32, fneu32, ilt32, ult32, ige32, uge32, ieq32, ine32
On one of Cyberpunk 2077's most complex compute shaders, this reduces
the maximum live registers from 696 to 537 (22.8%). Together with the
next patch, Cyberpunk's spills and fills are cut by 10.23% and 9.19%,
respectively.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22835>
While we're here, move it to after supported extensions to stay
consistent with the vk_physical_device_init parameters.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Constantine Shablya <constantine.shablya@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22845>
It is undefined behavior when an ARB assembly or shadow2d GLSL func
uses SHADOW2D target with a texture in not depth format.
In this case AMD and NVIDIA automatically replaces SHADOW sampler
with a normal sampler and some games like Penumbra Overture which abuses
this UB works fine but breaks with mesa.
Replace the shadow sampler with a normal one here by recompiling
the ARB program variant
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8425
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Illia Polishchuk <illia.a.polishchuk@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22147>
From OpenGL spec 8.6
"An INVALID_OPERATION error is generated if the object bound to
READ_FRAMEBUFFER_BINDING is framebuffer complete and its effective
value of SAMPLE_BUFFERS (see section 9.2.3.1) is one"
But some games might do this
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8425
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Illia Polishchuk <illia.a.polishchuk@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22147>