We'd eventually like to use an ISL helper that doesn't support
DRM_FORMAT_MOD_INVALID. Prepare for this by replacing the invalid value
with the modifier associated with the BO's tiling in more cases.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24120>
Partial renders exist to the spill the tilebuffer to memory, there's nothing to
do if it's already spilled (and would just waste memory bandwidth and create a
feedback loop).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Bizarrely, the clamps/wrap modes are respected so we need to set them
appropriately for correct out-of-bounds behaviour (returning all zero). That in
turn means we can't use whatever sampler is already there, instead we need to
allocate a dedicated sampler just for txf. Good news is we have an extra sampler
state register available for the purpose.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
We need to support arrayed images and sRGB images, which are hardware. For
atomics, we need to pack the augmented software data structure. Finally, we need
to support buffer images. Like their texture counterparts, these get lowered to
2D images.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
We now emulate an infinitely large binding table with bindless, so the sky is
the limit for this CAP. Note we still have the limit for samplers, so this
probably doesn't do anything for OpenGL.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
For sRGB render targets, we encode sRGB when writing pixels into the tilebuffer
(in the fragment shader), not when writing out the image. When we actually write
out the tilebuffer to the image, we don't use the PBE's sRGB conversion, we just
bind it as a UNORM 8 image and blit the pre-transformed pixels.
We're about to add real sRGB support for the PBE, so make this linearization
explicit.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Both textures and images share a unified indexing scheme in AGX. When binding
tables are used, they can be mapped to texture state registers. Otherwise, there
is bindless access available.
It would be nice to map OpenGL's binding table based textures and images to AGX
texture state registers 1:1. The problem is that OpenGL allows more combined
textures and images than we necessarily have texture state registers. So, we use
as many texture state registers as we can, and then we fallback on an internal
bindless scheme mapping an extended binding table.
Add and use a lowering pass to map all of the API-level texture/image indices to
either texture state registers or bindless handles as required.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
We need to set a particular bit on atomics for them to be coherent across
clusters. Fixes atomics on G13X.
Setting this bit on the single-cluster G13G, on the other hand, wedges the GPU.
So best be careful ;-)
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Fixes KHR-GLES31.core.draw_indirect.advanced-twoPass-transformFeedback-arrays
and KHR-GLES31.core.draw_indirect.advanced-twoPass-transformFeedback-elements
on M1 Ultra (G13D). Let's assume that same bits are required on M1 Pro
and Max.
Signed-off-by: Janne Grunau <j@jannau.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Deserializing isn't expected to be much more expensive than cloning, and the
serialized NIR is *significantly* smaller. So store the serialized instead of
the deserialized, and deserialize on the fly.
This reduces a lot of noise in valgrind due to random crap alloc'd against the
NIR shader by lowering passes that now get properly freed.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
This lets us force small tiles when they otherwise would not be
necessary, which is useful for decoupling tile size and the logic that
depends on it from things like MSAA and MRT which can trigger small
tiles.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
This requests synchronous TVB growth (instead of split renders). Mostly
for testing at this point.
Only works with newer kernels and the kernel will complain on dmesg for
now.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
In general, PBE descriptors map pipe_image_views for the hardware. That we use a
writeable shader image internally for render targets is an implementation-detail
of the end-of-tile program. So, refactor the PBE upload routine to take a
pipe_image_view (not a pipe_surface), and translate the pipe_surface into an
internal pipe_image_view for end-of-tile programs.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Save lowest dword of shader source sha1 in pipeline object for use
later as hash for uniquely identifying shader in debug outputs.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23942>
Use a struct for various common parameters rather than per stage
structure or arguments to stage specific entrypoints.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23942>
We skip ntt regalloc for vertex shaders and we have 1024 instruction
limit for R500 vs, so in theory we could run some shaders with more that
1024 ssa registers (if we can optimize the number of instruction in the
backend). So add one more bit.
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24154>
We might need more push uniforms (FAU) than the currently bound program. Update
that too for correct results on v9.
Fixes: c282f80c98 ("panfrost: Fix transform feedback on v9")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24198>
Without SP_FS_CTRL_REG0.LODPIXMASK quad ops don't get values from
helper invocations, but from the current one.
Fixes:
dEQP-VK.glsl.derivate.dfdxsubgroup.*
dEQP-VK.glsl.derivate.dfdysubgroup.*
Cc: mesa-stable
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24211>
That's the "real" name of the field.
It enables ALL helper invocations in a quad, which is necessary for
fine derivatives and quad subgroup ops.
While PIXLODENABLE by itself enables only 3 out 4 fragments in a quad.
Cc: mesa-stable
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24211>