We still need to handle uncompressed depth on G13X, but that might never
actually happen in practice.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19999>
The main buffer is twiddled as before, but there's now also an auxiliary
compression buffer that we need to reserve space for.
With compression, the main buffer is aligned less. The macOS logic seems to be
to align to the page size only if the texture is both 3D and mipmapped, *and*
the layer stride is greater than the page size.
That's gated on compression being enabled. Page alignment seems to be needed for
uncompressed twiddled cube maps.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19999>
Source engine uses flawed device detection in Linux native OpenGL backend,
which causes it to use bad configurations for Intel devices and thus
not always render correctly. Workaround this by using vendor string that
does not include "Intel" in it.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7725
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19828>
bind_sampler_states doesn't zero [nr_samplers, PIPE_MAX_SAMPLERS) so can get
non-null garbage samplers leading to a use-after-free (segfault derefencing
sampler) or a buffer overflow (writing samplers[] out).
Fixes crashes in Xonotic.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reported-by: dcow
Tested-by: dcow
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19994>
Better occupancy, which is especially important when the background shader
does memory access (for reloads). On my 4K monitor, glmark2 -bdesktop fullscreen
from 95fps to 133fps.
At default settings, glmark2 -bterrain from 63fps to 71fps.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19997>
This patch also includes the infrastructure for dumping sub-buffers in
print_sub_buffer() and new field types for floating and fixed point
decimals.
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18948>
This patch removes all fields dependent on the TEXTURE_WRAP_VARYING,
feature which is not currently supported.
It also removes STATE_PPP_CTRL.trp which is conditional on another
unused feature.
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18948>
This functionality should only need to be enabled when required by
other debug options.
While not used directly in this commit, it lays the groundwork for
dumping information from buffers referenced by other buffers.
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18948>
With PVR_DEBUG=cs, the control stream will be dumped to stderr
immediately prior to every render or compute job submission.
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18948>
When HAVE_VALGRIND is set, vbits of the CPU mapping are stored when
pvr_bo_cpu_unmap() is called. They can be reloaded by calling
pvr_bo_cpu_map_unchanged() instead of pvr_bo_cpu_map(). The vbits are
not loaded by default on every map, since they could easily have been
changed by the device between the unmap/map calls. Only use
pvr_bo_cpu_map_unchanged() when you can safely assume that nothing has
changed in the underlying memory.
When HAVE_VALGRIND is not set, pvr_bo_cpu_map_unchanged() just inlines
to pvr_bo_cpu_map().
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18948>
All enums should be unambiguous, so an error is raised when multiple
enum variants with the same value are encountered. When no enum
variants match the provided value, NULL is returned. This allows the
to-string functions to double as validators.
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18948>
These are (reasonably) fast helpers for computing the number of binary,
decimal or hexadecimal digits required to represent a given non-negative
integer.
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18948>
The bit has moved to FDCC_ENABLE on GFX11.
Found by inspection.
Cc: 22.3 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20005>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19966>
NIR shifts are defined to truncate the shift amount to the number of bits
needed to represent the bit-size of the value shifted. LLVM treats large
shifts as poison. This fix achieves NIR semantics for shifts.
As an example, a|(b << 32), where "a" is 32bits, should produce a|b
according to NIR (because 32&31 == 0).
This caused LLVM to incorrectly optimize "(a >> c) | (b << (32 - c))" to a
u2u32(pack_64_2x32(a, b) >> c) (v_alignbit_b32), when the original NIR
should have returned "a | b" if c==0.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19966>
We only copy two components, we have to use the complete original source,
and we should rewrite the new source from scratch to avoid incorrect
dimension and indirect handling.
Fixes: 036d7172c (virgl: Move double operands to a temp to avoid double-swizzling bugs)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19975>
The command buffer already emits ROP3_COPY if the logic op is disabled.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19843>
This bit can be enabled with various combinations and it looks better
to only emit it from the cmdbuf.
Fixes: 17b9aa92b7 ("radv: add support for dynamic logic op enable")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19843>
The vertex shader depends on the primitive topology.
Fixes: 2cce8500de ("radv: add support for dynamic provoking vertex mode with NGG")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19843>
The depth clamp mode depends on depth clip enable/disable.
Fixes: e48c0fbd8f ("radv: add support for dynamic depth clamp enable")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19843>
Instead of creating all 8 pipeline combinations when we initialize
the device we create the pipelines when we need to use them. This
is probably better because applications are likely to always use
the same flags for the copy command, which means that only one
pipeline may be required.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19981>
These features are no longer required.
Reviewed-by: Soroush Kashani <soroush.kashani@imgtec.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19750>
The OpenGL specifications explicitly calls out these queries as allowing
zero bits, so these features aren't actually required to bump the OpenGL
version.
While we could in theory also enable the corresponding extensions
unconditionally, this risks breaking applications that assume that the
presence of the extensions are sufficient to use meaningfully use them,
like is the case with most other OpenGL extensions.
However, blocking more recent GL versions due to this seems like a bit
of an overreaction. So let's allow new OpenGL versions, but not the
extensions themselves.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Soroush Kashani <soroush.kashani@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19750>
Similar to ARB_occlusion_query / ARB_occlusion_query2, this extension
allows zero bits for the queries, meaning there's no actual hardware
requirements here.
So let's just report zero bits if the driver doesn't support the CAP,
and treat these queries as dummies like we already do for occlusion
queries.
We still don't expose the extension, this is just to make it possible to
allow the core OpenGL functionality without exposing the extension.
Reviewed-by: Soroush Kashani <soroush.kashani@imgtec.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19750>
It's legal in OpenGL to start a query even if the result will have zero
valid bits. It's not enough to just report zero bits, We need to also
prevent calling down into the driver with these invalid queries.
Because ARB_ES3_compatibility adds ANY_SAMPLES_PASSED and
ANY_SAMPLES_PASSED_CONSERVATIVE to the set of queries that support zero
bits, we also need to check for the corresponding indices.
Fixes: 0186e9e1c5 ("mesa: always support occlusion queries")
Reviewed-by: Soroush Kashani <soroush.kashani@imgtec.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19750>
This reverts commit 811f8a1946. As Alpine put it
-- this is causing more problems than it's fixing. Hotfix to revert the
offending commit until a more measured fix can be implemented.
Closes: #7731
Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reported-by: Jan Palus
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19993>
UBSan complains otherwise:
../src/asahi/compiler/agx_pack.c:701:21: runtime error: left shift of 1 by 31 places cannot be represented in type 'int'
../src/asahi/compiler/agx_pack.c:534:18: runtime error: left shift of 8 by 28 places cannot be represented in type 'int'
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19971>
r5 and r6 are always getting lowered. Will prevent a regression with VBO
lowering on a shader which has stride=0 and hence gets the vertex ID read
optimized out with NIR:
dEQP-GLES2.functional.draw.random.50
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19971>
Don't waste 1MB per batch for descriptors that may be completely unused.
Instead, upload the scissor and depth bias arrays at submit time. This is a
simple solution to a silly problem: we can't grow the scissor/depth bias arrays,
and we don't know how big they will be at draw time. We could...
1. Statically allocate large buffers? Waste lots of memory.
2. Statically allocate small buffers? Forces too much flushing.
3. Dynamically allocate a growable GPU buffer? Requires either reading
back write-combined memory contents, or maintaining a CPU copy in
addition to extra GPU copies, or doing complicated MMU shenanigans.
The first two options are slow and the last is complicated.
Instead, we upload these descriptors to a dynamically allocated CPU-side which
gets copied just once to the GPU at submit-time when the exact size is known,
minimizing wasted memory and copies and avoiding any unnecessary flushing or WC
memory reads.
In addition, this patch makes sure we flush if we would overflow with more than
65535 scissor descriptors in a batch. This is a (minor) bug fix.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19971>
Rather than hanging onto them across batches. This lets us free these BOs if the
number of batches shrinks, which is pretty common if all 32 batches are used
during a loading screen for glGenerateMipmap() and then the in-game portion
drops to 1 or 2 batches only. Now that we have the BO cache wired up, this
should not adversely affect performance.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19971>
Massive performance gains, some fps before/after numbers from glmark2:
[shading] 1486 -> 2391
[refract] 87 -> 127
[terrain] 32 -> 56
...and it's basically for free with enough copy/paste, so thank you to Boris
Brezillon for an excellent Asahi patch, the LRU cache seems to work great on M1
:-p
There are a few minor changes I made from panfrost, notably adjusting the
constants to account for 16KiB pages and switching from pthread_mutex to
simple_mtx to be less weird in Mesa.
For context on the design, the following commits evolved it in Panfrost and
their commit messages may be useful... The logic in this module is the product
of years of mistakes and correcting course :-)
f06809cdca ("panfrost: Evict the BO cache when allocation fails")
77d0498913 ("panfrost: Fix major flaw in BO cache")
ee82f9f07e ("panfrost: Try to evict unused BOs from the cache")
2225383af8 ("panfrost: Make sure the BO is 'ready' when picked from the cache")
9af4aeaaf7 ("panfrost: Don't return imported/exported BOs to the cache")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19971>
This defeats the point of specifying alignments and of packing allocations
together with the BO cache. We're a real driver now, let's allocate memory like
one.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19971>