Similar to util_format_get_component_bits, get the bit offset for a
particular channel in a given format.
Use this to calculate the shift/mask sets for formats when creating DRI
configs, as a prelude to ripping out and replacing the hardcoded table.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27709>
needs some nontrivial wrapping but can mostly fallback to the regular impl.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27762>
Some D3D11 games rely on out-of-bounds indirect UBO loads to return
real values from underlying bound descriptor. This workaround would
prevent us from lowering indirectly accessed UBOs to consts.
Later DXVK would declare dynamically indexed uniforms with upper
size bound, to make the accesses spec compliant. But for now
we need our own workaround.
Known affected games:
- Dark Souls 3
- Sekiro: Shadows Die Twice
- Final Fantasy Type-0 HD
- Ultrakill
- Dishonored 2
DXVK discussions:
- https://github.com/doitsujin/dxvk/issues/405
- https://github.com/doitsujin/dxvk/issues/3861
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27727>
This adds the following template options:
- add an option to fill TC set_vertex_buffers from st_update_array directly
(always true without u_vbuf, so always used with radeonsi)
- add an option saying that there are no zero-stride attribs
- add an option saying that there are no user buffers
(always true with glthread, so always used with radeonsi)
- add an option saying that there is an identity mapping between vertex
buffers and vertex attribs
I have specifically chosen those options because they improve performance.
I also had other options that didn't, like unrolling the setup_arrays loop.
This adds a total of 42 variants of st_update_array_templ for various cases.
Usually only a few of them are used in practice.
Overhead of st_prepare_draw in VP2020/Catia:
Before: 8.5% of CPU used
After: 6.13% of CPU used
That's 2.37% improvement. Since there are 4 threads using the CPU and
the percentage includes all threads in the system, the improvement for
the GL thread is about 8% (roughly 2.17% * 4; each thread at 25% of global
utilization means 100% utilization in 4 cores).
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27731>
XeSS workaround is required for Hitman 3 to launch for Intel cards.
The following was observed during launch, coming from libxess.dll:
"Intel Plugin Extension ERROR: INTC_LoadExtensionsLibrary failed"
Signed-off-by: Casey Bowman <casey.g.bowman@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27477>
Persona 3 Reload is largely affected by the way amplification works with
NGG GS and disabling it drastically improve performance.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27518>
SPECviewperf creo-03 needs GL_EXT_shader_image_load_store in order for
its shaders to compile but we don't support a few corner cases that
didn't make it into the ARB variant. It seems to run fine with an
override, so just do that for now.
Cc: mesa-stable
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27429>
1.2.9 has been released in January 2017, so let's assume
we'll find it everywhere.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27311>
AoE4 samples texture on the edge between texels, which can cause
unexpected texel to be returned, and cause misrenderings. This workaround
enables coordinate rounding even in NEAREST mode, which fixes the problem.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9864
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27337>
except leaving u_endian.h behind to use __ANDROID__ directly to be
consistent with the rest in that file, which deserves a different
refactor
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27374>
Change force_vk_vendor=-1 for Red Dead Redemption 2 to avoid
pop up driver warning on launch. Warning only applies to Windows
driver.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27364>
The only change in behavior is that setting the affinity mask is skipped
when it has no effect, which happens when the app thread hasn't been moved
under a different L3 cache by the kernel (the sched_state variable tracks
that). This improves performance because setting the affinity mask incurs
measurable CPU overhead even if it has no effect.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27247>
"__ANDROID__ is defined by the compiler. It is defined for
all Android targets."
"ANDROID is set by some build systems, and not in any reliable
manner (in AOSP it actually just means that you're in AOSP; it's
set even when you're building host code for Windows, so it
doesn't mean much of anything)."
https://github.com/android/ndk/issues/407https://github.com/android/ndk/issues/807
Since the toolchain provides __ANDROID__ for actual Android
targets, standalone NDK and Soong builds should work.
The Mesa CI uses a combination of the NDK toolchain, internal
Android stubs and a Debian sysroot when compile testing Android.
Long-term, it should be possible to remove ANDROID and use
DETECT_OS_ANDROID everywhere.
Reviewed-by: Aaron Ruby <aruby@blackberry.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26874>
Game needs strict image count to not crash in non-vsync mode.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27038>
Talos Principle and related games on that engine fail to handle
SUBOPTIMAL properly. Adds an option to ignore SUBOPTIMAL and pretend
it's SUCCESS.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27037>
DOOM Eternal builds acceleration structures with inactive primitives and
tries to make them active in later AS updates. This is disallowed by the
spec and triggers a GPU hang. Fix the hang by working around the bug.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27034>
This function ends up getting called an awful lot when loading a large
Fossilize cache db, so let's replace strtol with something more
reasonable.
For a game with 97k shaders, this reduces instance creation time from
203ms to 52ms.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27052>