It already does compaction, so we just need to load vertex positions
and cull. This was easier than expected.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13829>
si_llvm_translate_nir() changes ctx.stage, so the outside code shouldn't
use it. This hasn't caused any issues yet. Since ctx.stage starts as 0,
the first use in this commit was a tautology.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13829>
Unlike Linux dma-bufs, D3D12 resources are strongly typed, and
can't necessarily just reinterpret the memory arbitrarily.
Allow importing resources with no description coming from the frontend,
and populate the resource desc from the driver instead. If there was
a template, make sure that it matches the incoming resource.
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13054>
The fallback path we averages unorm textures, but if we don't have ubwc on
either then we can just cast them to uint which then just takes sample 0.
The proper UBWC format I think ends up averaging, though.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13867>
GLES doesn't support it, and blob VK doesn't support it. We could
theoretically lower it, but don't bother since it's not required. Fixes
various piglit image load/store tests.
Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13852>
Fix a bug in BIND_PIPELINE XML reported by Dougall, which cleans up
a bit of both decoder and driver.
Instead of...
* 17 bytes BIND_PIPELINE (17)
* An unused 8 byte record (25)
* A set of N 8 byte records (25 + 8 * N)
* Oops, 1 byte too many! One just disappeared (24 + 8 * N)
It seems to instead be
* 24 bytes BIND_PIPELINE (24)
* A set of N 8 byte records (24 + 8 * N)
without the sentinel record. These means the 8 byte records themselves
are shuffled, with the high byte of the pointers split from the low
word, but that's less gross than an off-by-one.
It's still not clear what the last 8 bytes of the BIND_VERTEX_PIPELINE
structure mean, or the last 4 byte of the BIND_FRAGMENT_PIPELINE
structure which seems to be a bit shorter.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13784>
Dougall Johnson observed these structures make more sense with indices[]
first in the entries and indices[] absent from the header. Then the
sentinel entry disappears, nr_entries makes more sense, and a few magic
numbers pop out. Many thanks to Dougall's astute eyes.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13784>
this forces a full slab reclaim any time the device is known to have a
too-small BAR in order to keep memory usage at a minimum when it might otherwise
balloon out and crash us
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13850>
sometimes a driver might want to always reclaim all bo objects in the course
of allocating a new bo. this is useful when it's known that a given memory
heap is very small and will likely need to keep its usage minimized
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13850>
Some apps may try to use a viewport adjusted by 0.5 pixels (among other
things) to emulate d3d9 pixel center, and in this case we would end up
with incorrect "fake scissor" box (shifted by 1 pixel), hence pixels
being incorrectly scissored away when permit_linear_rasterizer is set
(this happens even if the linear rasterizer is not used in the end).
So adjust the offset so that the half-way points get rounded down instead
of up.
(This is all a bit iffy I think since we don't use fractional
boxes (with 8 subpixel bits) anywhere yet, but at least without msaa
it should work out.)
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13794>
Currently we run deqp-runner inside a single VM, which makes very poor
use of the available CPUs because Virgl has a bottleneck in the VMM that
serializes everything.
With this change, we can run several Crosvm instances in a runner and
make full use of the CPUs. Getting the same coverage with 3 runners
instead of 6.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12828>
nopt will disable some shader optimizations that slow down test runs for
no gain.
no_quad_lod will disable some speed hacks that can cause inaccurate
results.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12828>
this reworks PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER into an
enum as PIPE_CAP_TEXTURE_TRANSFER_MODES, enabling drivers to choose
a (sometimes) faster, compute-based download mechanism based on a new
pipe_screen hook
compute pbo download is implemented using shaders with a prolog to convert
the input format to generic rgb float values, then an epilog to convert
to the output value. the prolog and epilog are determined based on a vec4
of packed ubo data which is dynamically updated based on the API usage
currently, the only known limitations are:
* GL_ARB_texture_cube_map_array is broken somehow (and disabled)
* AMD hardware somehow can't do depth readback?
otherwise it should work for every possible case
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11984>
this can be used to query whether a driver expects a given texture
copy to be faster as a compute shader or using cpu/gfx transfers
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11984>
this shouldn't report the budgeted available memory, it should return
the total memory, as that's what this api expects
Fixes: ff4ba3d4a7 ("zink: support PIPE_CAP_QUERY_MEMORY_INFO")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13849>
This fixes:
X Error of failed request: GLXBadWindow
Major opcode of failed request: 152 (GLX)
Minor opcode of failed request: 32 (X_GLXDestroyWindow)
Serial number of failed request: 9667
Current serial number in output stream: 9674
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13611>