Otherwise wide lines break. The alternative approach is to eliminate the points
writes when not drawing points since we do have topology information at compile
time. I'm admittedly stuck in my GL mindset. That's the approach we'll need for
Valhall anyway.
Fixes dEQP-VK.rasterization.interpolation.basic.lines_wide
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16204>
The VAR_TEX definition in ISA.xml only has a field for texture_index,
so trying to read sampler_index will return zero; read from
texture_index instead, and rename other fields for consistency.
The texture and sampler indices must be equal for VAR_TEX to be used,
so either name could be used for the field.
Fixes the wrong textures being used in Thief.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6219
Fixes: eb1479bda2 ("pan/bi: Support message preloading")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16255>
While using three component texture formats results in CTs failures,
three component vertex attributes are fine, and not allowing them
results in significant performance regressisons.
Fixes: e41958e344
r600: Disable eight bit three channel formats
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6399
v2: rename function to is_buffer_format_supported (Emma)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16267>
Discrete platforms don't have LLC, but on those, we mmap our buffers
with WC. So we shouldn't need to clflush there.
Anv already had a boolean field on the physical device to know whether
we need to use clflush(), based off the memory heaps available. So use
that instead.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15780>
981bd8cbe2 moved outputs removing handling to NIR, but instead of
applying it only to the last stage before the FS this now applies
it to both the GS and the VS.
This commit fixes this by clearing the kill_outputs field for
the VS when using a ES-GS shader.
Fixes: 981bd8cbe2 ("radeonsi: apply key.ge.opt.kill_{outputs,pointsize,clipdistance} in NIR")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16249>
Layout transitions are not relevant to us, we only care about barriers
that involve a sync point between read/write actions on the image across
GPU jobs.
Image transitions from undefined layout can only happen before the image
is ever used by the GPU, which means they are never relevant to our
implementation.
This improves performance in vkQuake.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16235>
The current DCE pass hits issue around phi nodes. These need to be
solved properly eventually, but for now workaround them by doing
something obviously correct (but suboptimal compile time).
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16268>
We need to insert parallel copies at the logical end of blocks, before branches.
Add a pseudo instruction signaling that. Cribbed from ACO.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16268>
Lifted from ir3. Algorithm is the same; the data structures and interface are
lightly modified to decouple from ir3's IR.
Sequentializing parallel copies after RA is tricky. ir3's implementation works
well enough, so I use that one.
Original implementation by Connor Abbott.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16268>
Rather than using builder magic (implicitly lowered on emit), add actual pseudo
operations (explicitly lowered before encoding). In theory this is slower, I
doubt it matters. This makes the instruction aliases first-class for IR prining
and machine inspection, which will make optimization passes easier to write.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16268>
Lifted from Bifrost. Add some basic optimizer tests (they pass!) to show the
compiler is ready to be unit tested. Given we can't have hardware CI for Asahi
yet -- and dEQP is still pretty janky -- unit testing should prove quite useful.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16268>