I see no reason to hide this. The small hit in cycle count is offset in
practice by the increase in thread count. So let's ship it and get some
testing.
If this regresses a workload:
1. Open an issue on the tracker and attach an apitrace.
2. In the meantime set PAN_MESA_DEBUG=nofp16 to override.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5960>
EXT_shader_framebuffer_fetch requires the fetched value to be per-sample, so we
need to load the sample id when in a fragment shader.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5930>
There are a lot of them and they are mostly uninteresting, so don't
disassemble them or print shader-db results.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5948>
Opening the dump file in pandecode_jc instead of doing it in
pandecode_next_frame avoids creating zero sized files when
applications exit.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5931>
We're overreporting on some chips and underreporting on others. Let's be
more honest.
This exposes OpenGL ES 3.0 on Mali T760 through T860.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5858>
Requires the ability to texture the stencil-only portion, and then
u_blitter kicks in for the rest.
v2: Fix dEQP-GLES31.functional.stencil_texturing.*
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5858>
v2: Be more explicit about sampler types. Prefer the term "load" to
"resolve" to match VK convention. Generate shaders for MRT 8x. Blit
shader generation adds about 6ms to startup cost. We could cache thes.
shaders to disk if we needed to (or indeed, ship binaries).
v3: Fallback on u_blitter on Bifrost so Bifrost continues to work.
KHR_partial_update support is mostly no-oped on Bifrost now, but that's
okay for now - compositors are still functional.
v4: Specialize on multisample state as well to enable reloads of MSAA
textures. This requires 2x the shader variants, so I assume we're up to
12ms startup cost for generation. Annoying. Also fix interactions with
depth- or stencil-only clears of combined depth-stencil surfaces.
v5: Cache to the device (screen) instead of the context, reducing
duplicated work in apps that create many contexts (e.g. Chromium)
v6: Squash in KHR_partial_update cleanup to fix intermediate
regressions on a few tests.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5824>
We write to r2, which is preseved through to the blend shader, from
where it is read. We won't worry about MRT to keep things simple.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5620>
Real writeout stores, which break execution, need to be moved to after
dual-source stores, which are just standard register writes.
v2: Don't move stores forward, to avoid moving them to before where
their source is written.
v3: Only reorder past dual-source stores.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5620>
Fixes error when the assert is optimized out:
../src/panfrost/midgard/midgard_compile.c: In function ‘output_load_rt_addr’:
../src/panfrost/midgard/midgard_compile.c:1644:1: error: control reaches end of non-void function [-Werror=return-type]
}
Closes#3270
Fixes: 7781d2c2ea ("pan/mdg: Support MRT in output load lowering")
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5895>
Mali (and Vulkan) uses D3D naming conventions for these formats where
Gallium/Mesa uses OpenGL names, but the formats are equivalent. sRGB is
communicated out-of-band on Mali; otherwise, it appears to be a 1:1
mapping.
On supported devices, this exposes GL_EXT_texture_compression_rgtc and
GL_ARB_texture_compression_bptc, so update features.txt
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5856>
Ancient relic from kbase. Panfrost kernel doesn't need this.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5859>
Always checked together and really signal the same property from
different perspectives.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5859>
Now we have a central store of them, so we may remove active_bo.
v2: Squash two patches together to prevent a race condition mid-series.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5859>
No clue how this worked before.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Fixes: 82f18b713a ("panfrost: Keep track of active BOs")
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5859>
Otherwise we end up implicitly converting ints to floating point.
Likewise for floats which again have strange interactions.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5755>
This allows the conversion to be removed when the output is needed as
f32 anyway, for example for highp framebuffer fetch.
v2: Also change operations such as i2i16 to i2imp (Alyssa).
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5755>
Native loads are the same for any format, so we can use the same
shader variant for all framebuffer formats with a native load.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5755>
When EXT_shader_framebuffer_fetch is used and only depth and/or
stencil are written, we can't rely on the first output being to
depth/stencil.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5755>
The pass is useful for EXT_shader_framebuffer_fetch, not just blend
shaders, so we should do it with the other lowering passes in
midgard_compile_shader_nir.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5755>
We can treat it as RGBA32UI and swizzle away everything but R, like the
blob does. Maybe not the most efficient thing in the world.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5827>
Removes random global state flying about which doesn't really work for
common code. We cleanup some debug messages while we're at it because
the mostly-unused DBG macro relies on magic state.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5794>