This allows to use a driconf to override the API version and expose Vulkan 1.3.
That can be used in conjunction with certain games like for example Brawlhalla
which benefits from some DXVK +2.0 features.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26433>
vl_rbsp_fillbits may fill less than 32 bits if it removes emulation
prevention bytes, but will fill at least 16 bits. We need to call it
twice when reading more than 16 bits.
This fixes parsing H264 SPS packed header in va frontend when emulation
prevention bytes are at position where 32 bit values are read.
Cc: mesa-stable
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26276>
There are rendering issues with FCV on DG2 and Unreal engine 5.1,
patch adds option to disable fcv in drirc.
Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26169>
On Steam Deck, shaders are pre-compiled for better performance (less
stuttering, less CPU usage, etc). But when a compiler fix needs to be
backported, there is currently no way to handle this properly.
This introduces 3 drirc options
radv_override_{graphics,compute,ray_tracing}_shader_version in order to
force the driver to re-compile pipelines when needed. By default, the
shader version is 0 for all pipelines.
When one drirc is set for a specific game, RADV will re-compile all
pipelines only once with the compiler fix included (because the
pipeline key would be different).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26094>
The Talos Principle VR shares the same engine quirk as its non-VR counterpart.
Backport-to: 23.2
Backport-to: 23.3
Reviewed-by: Antonino Maniscalco <antonino.maniscalco@collabora.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26047>
util_fast_urem32 is used in the hot path of hashmap lookups and this
asserts causes noticeable overhead. The correctness of this code should
be well exercised both from testing and mathematical proofs, so gate
this assertion behind #ifdef DEBUG.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14168>
This is based on the d3d12 code, and is mostly a rewrite in C,
these are just some helpers to use for writing h264 and h265
headers for vulkan encode.
Acked-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25874>
An "augmented tree" is a tree with extra data attached which flows from
the leaves to the root. An "interval tree" is a datastructure of
(potentially-overlapping) intervals where, in addition to inserting and
removing intervals, we can quickly lookup all the intervals which
overlap a given interval.
After describing red-black trees, CLRS explains how it's possible to
implement an interval tree using an augmented red-black tree where the
nodes are ordered by interval start and each node also stores the
maximum interval end for its entire subtree.
Implement the interval tree extension described by CLRS. Iterating over
all overlapping intervals is actually an exercise, so we have to solve
the exercise. The recursive solution has been re-written to use the
parent pointers to avoid needing a stack, similarly to rb_tree_first()
and rb_node_next().
For now, we only implement unsigned intervals, but the core algorithms
are all abstracted to allow other types. There's still some boilerplate,
but it's the best that can be done in C.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22071>
MacOS does not have the libdrm libraries so is missing xf86drm.h.
util/libdrm.h already has a collection of stubs for systems that do not support the libraries.
A compile on MacOS will fail with the source that uses newer drm functions and structures.
Update adds in missing items that MacOS code needs to compile and run.
New code is copied from the public repository: https://gitlab.freedesktop.org/mesa/drm/-/blob/main/xf86drm.h
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25992>
Another case of a game assuming XeSS is available since an
Intel ARC GPU is discovered by the game's executable binary.
With this, a warning will appear that GPU is unstable/not supported,
but a warning is preferable over the game crashing.
No other issues observed upon starting & playing the game.
Signed-off-by: Casey Bowman <casey.g.bowman@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25965>
Fixes UBSan error:
src/util/sha1/sha1.c:140:8: runtime error: null pointer passed as argument 2, which is declared to never be null
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25853>
This is used for parsing VA packed headers and those can be without
emulation prevention bytes.
Add emulation_bytes argument to vl_rbsp_init and skip all emulation
bytes handling when set.
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25565>
When the number of draw calls is very large, instead of allocating
large amounts of batch buffer space for the draws, use a ring buffer
and process the draw calls by batches.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8645
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Tested-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25361>
gcc is able to optimize away either the modulo or the logical and. This
makes no difference to gcc.
clang is only able to optimize away the logical and. This allows clang
to generate faster code for BITFIELD_MASK.
As for BITFIELD64_MASK, this also makes no difference to clang except it
fixes a compile error for BITFIELD64_MASK(64):
error: shift count >= width of type [-Werror,-Wshift-count-overflow]
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9989
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25757>
The driver glue doesn't have access to that information in a
centralized place. If you want to generate perfetto iid, you need
access to all names.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Felix DeGrood <felix.j.degrood@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25730>
Useful to figure out what's the tracepoint name you're implementing.
We'll use this in the intel perfetto integration glue to index into an
array of perfetto iid.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Felix DeGrood <felix.j.degrood@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25730>
With this, we pick up new in-tree defaults for driconfig variables
when using meson devenv. This is useful for testing new config
variables.
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21808>
This adds an environment variable that can be used to override the
global drirc serach directory. This can be useful for debugging, and
meson devenv, as used in the following commit.
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21808>
this patch calls the init and finish functions of the vk
runtime astc decoder. initializes emulate_astc flag. sets
up the additional plane to store decoded texture.
v2: fix _tex_dataformat() and _tex_numformat() (Chia-I Wu)
use correct function for bufferToImage (Chia-I Wu)
v3: add radv_is_layout_emulated() (Chia-I Wu)
avoid repeated pattern (Chia-I Wu)
v4: not create all pipelines on_demand (Chia-I Wu)
v5: current code does not support astc hdr (Chia-I Wu)
v6: keep luts in staging buffer only (Chia-I Wu)
v7: use 2DArray for both input and output
v8: document todo to use fp16 (Chia-I Wu)
not required to move meta init anymore (Chia-I Wu)
move astc_emulation_format to vk_texcompress_astc.h (Chia-I Wu)
v9: remove LAYOUT check (Chia-I Wu)
check on iview->vk.view_format
move setting tiled flags for astc (Chia-I Wu)
remove is format emulated check in radv_is_storage_image* (Chia-I Wu)
use LAYOUT_ASTC for if check (Chia-I Wu)
no 1D support (Chia-I Wu)
calculate start end offset in 2x blk size
v10: remove old wrong code (Chia-I Wu)
v11: use existing defined local format variable (Chia-I Wu)
dst image layout is always VK_IMAGE_LAYOUT_GENERAL (Chia-I Wu)
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24672>
v2: remove extra u_formats.h header addition (Chia-I Wu)
use stddef.h and not unistd.h (Chia-I Wu)
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24672>
We convert from doubles to half by going through float in between, but
as noted in the comment in this commit, that can give wrong results in
some cases.
Add some helpers to ensure correct results based on rounding mode that
will be used in the next commit.
v2: Use fi/di from u_math.h (Ian)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25281>
In the linear allocator, when a size larger than the minimum
buffer size is allocated, we currently create the new buffer
to fit exactly the requested size.
In that case, don't bother updating the `latest` pointer, since
this newly created buffer is already full. In the worst case,
the current `latest` is full and it would be the same; in the
best case, there's still room for small allocations, so we avoid
wasting space if the next allocations would fit.
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25517>
A descriptor set is internally reserved for descriptor set dynamic
offset which might not be used by an applications which otherwise
requires an extra descriptor set. This driconf option allows making
that trade-off by dropping support for dynamic offsets in exchange
for an extra descriptor set which means 5 usable descriptor sets on
A6XX and 8 on A7XX.
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25534>
In release mode print the ralloc tree header pointers. In debug mode,
will look at the child allocations recursively checking for the canary
values. In most cases this should work and give us extra information
about the type of the subtree (GC, Linear) and the allocation sizes.
For example, calling it at the end of spirv_to_nir() will output the
following tree with a summary at the end (lines elided to focus on the
general structure of the output)
```
0xca8d00 (472 bytes)
0xdf2c60 (32 bytes)
0xdf2c00 (32 bytes)
0xdf2ba0 (32 bytes)
0xdf2b40 (32 bytes)
(...)
0xcde8b0 (64 bytes)
0xcde760 (72 bytes)
0xd2de40 (168 bytes)
0xcce660 (64 bytes)
0xcce510 (72 bytes)
0xcce5a0 (120 bytes)
0xcce490 (64 bytes)
(...)
0xcbc450 (456 bytes)
0xdf55c0 (160 bytes)
0xdf5730 (72 bytes)
0xdf57c0 (80 bytes)
0xdf5530 (72 bytes)
0xdf56a0 (80 bytes)
(...)
0xcbe840 (128 bytes)
0xcc4310 (4 bytes)
0xcbc660 (536 bytes) (gc context)
0xde6b40 (32576 bytes) (gc slab or large block)
0xddb160 (32704 bytes) (gc slab or large block)
0xdc8d50 (32704 bytes) (gc slab or large block)
(...)
0xcde9a0 (32704 bytes) (gc slab or large block)
0xcd6720 (32704 bytes) (gc slab or large block)
0xcce6e0 (32768 bytes) (gc slab or large block)
0xcbc330 (72 bytes)
0xd680a0 (208 bytes)
0xca9010 (78560 bytes)
0xca8f20 (176 bytes)
==== RALLOC INFO ptr=0xca8d30 info=0xca8d00
ralloc allocations = 4714
- linear = 0
- gc = 23
- other = 4691
content bytes = 1055139
ralloc metadata bytes = 226272
linear metadata bytes = 0
====
```
There's a flag to pass so only the summary at the end is printed.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25482>