The 1-bit versions are already lowered by nir_lower_alu_width
before pan_nir_lower_bool_to_bitsize() is called.
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41472>
Make r300_get_endian_swap return NO_SWAP directly on little endian. This
keeps the depth/stencil raw alias fix and the BE render-to-texture
exception intact while avoiding the mixed ifdef/runtime endian checks.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42218>
Tiled depth/stencil transfers can raw-copy 32-bit ZS storage through an
RGBA8 color alias. On big endian, the normal RGBA8 array policy programs
no swap, while the underlying ZS storage uses the dword endian
convention. Keep those raw aliases on dword swap so depth readback sees
the same byte order that the ZS path wrote.
Fixes spec@arb_depth_texture@depthstencil-render-miplevels 146 d=z24.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42218>
On big endian, several colorbuffer formats consume the constant
blend-color register lanes in a format-specific order. Program the inverse
lane order for A8R8G8B8, RGBA8/RGBX8/RGB10A2, RGB565, and B5G5R5* so
GL_CONSTANT_COLOR and GL_CONSTANT_ALPHA see pipe RGBA/RGB.
This fixes the constant blend-factor dEQP GLES2 fragment_ops blend cases
on BE RV370 and the affected 8888, 1010102, 565, and 1555 subtests in
Piglit fbo-blending-formats.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42218>
On big endian, the r300 colorbuffer path needs RGB lane order for
RGB565 and for the A1B5G5R5/X1B5G5R5 1555 render aliases. The generic
BGRA lane mapping swaps red and blue for these render targets.
For B5G5R5* textures, override the generic W1Z5Y5X5 sampler swizzle so
GL_RGB5 and GL_RGB5_A1 render-to-texture sample with the same convention
that RB3D used when writing the colorbuffer. This fixes the Sauerbraten
minimap trace.
RGB565 also needs a transfer-boundary conversion: resources are stored in
the hardware lane order, while CPU maps must expose Gallium's
PIPE_FORMAT_B5G6R5_UNORM convention. Swap the 5-bit red/blue fields on
BE RGB565 maps so uploads and readbacks remain CPU-visible B5G6R5.
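
A minimal sketch of the red/blue field swap described above, assuming a
plain 16-bit 565 layout with the two 5-bit fields at either end and the
6-bit green field in the middle. swap_rb_565 is a hypothetical name, not
the driver's helper, and any byte swapping of the 16-bit word itself is
out of scope here:

```c
#include <stdint.h>

/* Exchange the two 5-bit end fields of a 16-bit 565 pixel.
 * The 6-bit green field in bits 5..10 is left untouched. */
static inline uint16_t swap_rb_565(uint16_t px)
{
   uint16_t low5  = px & 0x001f;          /* bits 0..4 */
   uint16_t green = px & 0x07e0;          /* bits 5..10 */
   uint16_t high5 = (px >> 11) & 0x001f;  /* bits 11..15 */

   return (uint16_t)((low5 << 11) | green | high5);
}
```

Applying the function twice returns the original pixel, so the same
helper would serve both the upload and the readback direction.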
Assisted-by: Codex (GPT-5.5)
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42218>
On big endian, r300 normally uses DWORD_SWAP for 8888 array formats so
CPU-visible bytes match the Gallium component order.
That convention is wrong for non-sRGB 8888 resources that are both render
targets and sampler views. Glamor can render to such a BO in VRAM and
later sample it after the kernel migrates it to GTT under memory pressure.
The bytes survive the move, but the sampler observes the colorbuffer
contents with the opposite component convention, causing inverted colors
around moved windows.
Use NO_SWAP for both colorbuffer and sampler state for these resources,
while leaving transfer resources, sRGB, pure render targets, and
upload/readback paths on the existing endian policy.
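
The policy above can be sketched as a small predicate. This is an
illustration only: the struct, its fields, and the enum are hypothetical
stand-ins for the real resource and format queries, and the function
assumes it is only called for 8888 array formats on big endian:

```c
#include <stdbool.h>

enum example_swap { EXAMPLE_NO_SWAP, EXAMPLE_DWORD_SWAP };

/* Hypothetical summary of the properties the commit message mentions. */
struct example_resource {
   bool is_srgb;        /* sRGB 8888 format */
   bool is_transfer;    /* staging/transfer resource */
   bool render_target;  /* bound as a colorbuffer */
   bool sampler_view;   /* bound as a texture */
};

/* Big-endian swap choice for an 8888 array resource (assumed input). */
static enum example_swap
example_be_swap_for_8888(const struct example_resource *res)
{
   /* Only non-sRGB resources used as both render target and sampler
    * view switch to NO_SWAP, so both units see the same byte order. */
   if (!res->is_srgb && !res->is_transfer &&
       res->render_target && res->sampler_view)
      return EXAMPLE_NO_SWAP;

   /* Everything else stays on the existing DWORD_SWAP policy. */
   return EXAMPLE_DWORD_SWAP;
}
```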
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15398
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42218>
ZPASS counter writes follow the programmed ZB endian mode. The normal
depth-buffer path already sets the BE depth endian bits, but the
dummy-Z path used for occlusion queries without a depth buffer did
not. Set the dummy-Z pitch endian bits on BE as well, then read query
counters in native CPU order.
This fixes byte-swapped occlusion query results from meta draw
operations on big-endian r300 while preserving the no-depth dummy-Z
query path.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42218>
The BE VAP path rejects vertex attribute formats smaller than 32 bits,
but the zero-vertex-elements fallback used R8G8B8A8_UNORM as its dummy
PSC attribute. This made no-attribute draw tests abort in r300_vertex_psc.
Use R32_FLOAT for the dummy attribute instead. It keeps the same 4-byte
dummy stride while satisfying the BE format restriction.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42218>
The VAP needs either a 32-bit endian swap for everything, or no swap at
all with everything converted to LE manually. This implements the former,
mirroring the current behaviour for vertex attributes.
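
For reference, a CPU-side sketch of what a 32-bit swap does to a buffer.
The hardware performs the swap itself; bswap32 and dword_swap_buffer are
hypothetical names used only for illustration. Because the byte order is
reversed per dword, elements narrower than 32 bits end up reordered
within each dword:

```c
#include <stddef.h>
#include <stdint.h>

/* Reverse the byte order of one 32-bit value. */
static inline uint32_t bswap32(uint32_t v)
{
   return (v >> 24) |
          ((v >> 8) & 0x0000ff00u) |
          ((v << 8) & 0x00ff0000u) |
          (v << 24);
}

/* Apply the same swap to every dword in a buffer, in place. */
static void dword_swap_buffer(uint32_t *dwords, size_t count)
{
   for (size_t i = 0; i < count; i++)
      dwords[i] = bswap32(dwords[i]);
}
```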
Assisted-by: Codex (GPT-5.5)
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42218>
The index offset is derived from index_bias, which is signed and can be
negative when r300_split_index_bias() has to emulate a negative bias on
pre-R500 hardware. Keep the translation helper parameter signed instead
of converting it to unsigned at the function boundary.
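
A minimal sketch of a translation helper with a signed offset parameter.
translate_indices_u16 is a hypothetical name and signature, not the
actual r300 helper; it only shows that a negative offset can be passed
through without a conversion at the function boundary:

```c
#include <stdint.h>

/* Copy 16-bit indices while adding a signed offset to each one.
 * The caller is assumed to guarantee that in[i] + index_offset
 * stays within the 16-bit index range. */
static void translate_indices_u16(const uint16_t *in, uint16_t *out,
                                  unsigned count, int index_offset)
{
   for (unsigned i = 0; i < count; i++)
      out[i] = (uint16_t)((int)in[i] + index_offset);
}
```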
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42218>
This is useful for patching sysvals in an array of push uniforms
cpu-side.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41654>
For passthrough GS, we need to compile a full panvk_shader with all of
the GS variants. I don't love increasing the number of shader
compilation entrypoints that we have floating around, but I think this
is preferable to exposing panvk_compile/preprocess_shader directly.
Another option would be to have a panvk_meta_shader helper, like how
hk_meta_shader works. This gets messy because for consistency we would
also want a panvk_internal_meta_shader helper, and slotting that into
the existing create_internal_shader uses is difficult due to variation
in how shaders are created and how pan_compile_inputs is set up. We
could do it, but it would be messy.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41654>
This allows us to simplify synchronization in GS dispatch a bit, and is
necessary for multidraw GS because we can't use SYNC_ADD-style
synchronization inside a loop. This is because relative_sync_point must
be advanced by a statically known value.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41654>
Co-authored-by: Olivia Lee <olivia.lee@collabora.com>
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41654>
It now walks over zero variants if the shader is NULL, and there is a
const version for use in state tracking code when we want to walk over
all the variants of a const panvk_shader.
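
One way such a NULL-tolerant iterator could look, as a sketch only. The
example_* types and macros are hypothetical and do not reflect the real
panvk_shader layout. Note that the macro evaluates its shader argument
more than once:

```c
#include <stddef.h>

#define EXAMPLE_MAX_VARIANTS 4

struct example_variant {
   int id;
};

struct example_shader {
   unsigned variant_count;
   struct example_variant variants[EXAMPLE_MAX_VARIANTS];
};

/* Walks every variant; the body runs zero times if shader is NULL. */
#define example_foreach_variant(shader, var)                                \
   for (struct example_variant *var = (shader) ? (shader)->variants : NULL, \
        *var##_end = var ? var + (shader)->variant_count : NULL;            \
        var != var##_end; ++var)

/* Same walk for a const shader, yielding const variant pointers. */
#define example_foreach_variant_const(shader, var)                          \
   for (const struct example_variant *var =                                 \
           (shader) ? (shader)->variants : NULL,                            \
        *var##_end = var ? var + (shader)->variant_count : NULL;            \
        var != var##_end; ++var)

/* Example user: counts variants without a separate NULL check. */
static unsigned example_count_variants(const struct example_shader *shader)
{
   unsigned n = 0;

   example_foreach_variant_const(shader, v)
      n++;

   return n;
}
```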
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41654>
We can benefit from just constructing the push uniforms without the
function knowing anything about where to store the gpu address for it.
Co-authored-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41654>
This means asking the CSF to walk the indirect draw buffer twice in the
case of non-GS indirect draws, but the cost there probably isn't high.
For draws with geometry shaders, we'll want to patch the VS descriptors
and launch the compute VS as a multi-dispatch, iterating over the
patched descriptors similarly to the way multi-draw does today. The
final draw might not actually be a multi-draw, depending on details of
the geometry pipeline. This means we need to separate attribute
patching from drawing.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Co-authored-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41654>
Before we do anything else in the draw, particularly anything involving
geometry, we want to prepare all our descriptors. The geometry pipeline
will use the same descriptors regardless of HW vs. SW vertex shading and
regardless of which geometry shaders are used. Preparing things
up-front makes it easier to flex and emit the necessary draws
afterwards.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41654>
This and the previous commit are the beginning of an overall restructure
of the state flushing flow. The idea is that the very first thing
flushed are descriptors, including vertex and fragment attributes.
Importantly, flushing descriptors does not depend on which vertex shader
variant we end up picking in the end. Both the HW and SW VS will use
the same descriptors.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41654>
This is similar to the previous commit except it's not moving nearly as
much around. The biggest change here is that we move it out of the
dirty check section so that we can assert that VS_PUSH_UNIFORMS is
always dirty for indirect draws.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41654>
There's an issue with indirect draws where we do descriptor patching on
the command stream if the indirect baseInstance is non-zero. In order
for this to work, we need to ensure we have N fresh attribute tables and
N fresh resource tables to reference them. We tried to handle this by
smashing MESA_VK_DYNAMIC_VI to dirty whenever we might hit this case.
However, we originally placed this check before we actually computed
vi.attribs_changing_on_base_instance so the check was wrong if the
previous VI set didn't have VK_VERTEX_INPUT_RATE_INSTANCE anywhere.
Fortunately, if we ever hit that case and it mattered, VI would be dirty
so we would re-emit anyway.
This commit adds a new BASE_INSTANCE driver dirty flag and sets that
whenever the base instance changes or we're an indirect draw. Then we
can add some helpers which do some sanity assertion checks and make the
correctness far more clear overall. This also lets us reduce some of
the calculation duplication to prevent things from getting out of sync
in the future.
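
The dirty-flag rule described above can be sketched roughly as follows.
The example_* names are hypothetical and the real driver state is more
involved; the sketch only shows the rule "dirty when the base instance
changes or the draw is indirect":

```c
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_DIRTY_BASE_INSTANCE (1u << 0)

struct example_draw_state {
   uint32_t dirty;          /* bitmask of driver dirty flags */
   uint32_t base_instance;  /* last base instance seen on the CPU */
};

/* Record the base instance for the next draw and flag it dirty when it
 * changed, or unconditionally for indirect draws, where the value is
 * not known on the CPU. */
static void example_set_base_instance(struct example_draw_state *state,
                                      uint32_t base_instance, bool indirect)
{
   if (indirect || state->base_instance != base_instance)
      state->dirty |= EXAMPLE_DIRTY_BASE_INSTANCE;

   state->base_instance = base_instance;
}
```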
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41654>
With geometry shaders, we're going to be changing the IA state as we
handle the geometry pipeline and it'll make things much cleaner if we
make it part of the draw info.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Co-authored-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41654>
With geometry shaders, we're going to be changing the IA state as we
handle the geometry pipeline and it'll make things much cleaner if we
make it part of the draw info. To keep things consistent, we refactor
both halves of the driver. JM is simpler so we start there.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41654>
We already have a default case, so there is no need to specify each
invalid primitive topology explicitly.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41654>
Geometry shaders are going to need to be able to replace direct draws
with indexed draws, and indexed draws with indexed draws that use a
different index buffer. To facilitate this, stop trusting the API-level
dirty bits and instead plumb the index buffer through the draw info and
track the hardware state separately on CSF.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41654>
This unifies the setup code between direct and indirect draws a bit.
There's now one panvk_cmd_draw() helper which calls into the prepare*()
helpers and then calls a direct or indirect launch_*draw() helper. This
new helper takes a copy of the panvk_draw_info, not a pointer, because it
may modify it as part of executing the polygon pipeline.
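
A small sketch of the pass-by-value pattern, under the assumption that
the helper rewrites its local draw info before launching. The example_*
types and the specific modifications are arbitrary stand-ins, not what
panvk_cmd_draw() actually does:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily reduced draw info. */
struct example_draw_info {
   uint32_t vertex_count;
   bool indexed;
};

/* Stand-in for a launch helper; returns the count it would draw. */
static uint32_t example_launch_draw(const struct example_draw_info *info)
{
   return info->vertex_count;
}

/* Takes the draw info by value so that edits made while processing the
 * draw never leak back into the caller's copy. */
static uint32_t example_cmd_draw(struct example_draw_info info)
{
   /* Arbitrary edits standing in for whatever the pipeline rewrites. */
   info.indexed = true;
   info.vertex_count *= 2;

   return example_launch_draw(&info);
}
```

The caller's struct is unchanged after the call, which is the property
the copy is meant to guarantee.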
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41654>
Once we start actually compiling variants, it's going to be much more
convenient if we only have one descriptor table per logical shader. All
the variants can fetch from the one set of tables. We can't duplicate
push, however, because that depends on the behavior of the back-end
compiler and might be different per-variant.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41654>
There are some important ordering factors here but there's no reason why
these can't be done closer together or why we can't group FS and VS
together. Also, now that we know we check fs != NULL before calling
either desc/SSBO prepare helper, we can drop the shader checks from
them.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41654>
This breaks most of the loop out into a new prepare_draw_layer() helper
which does all the allocations and state management, leaving the caller
to execute the actual jobs.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41654>