so that users don't have to shift it at every use. It was supposed to be
like this from the beginning.
Fixes: fb994f44d9 - util: make BITSET_TEST_RANGE_INSIDE_WORD take a value to compare with
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29187>
(cherry picked from commit 5502ecd771)
getenv("HOME") is significantly faster than getpwuid_r(...)->pw_dir,
because the latter may require loading NSS libraries, reading
/etc/passwd, etc.
Furthermore, the Linux man pages for getpwuid_r recommend using
getenv("HOME") instead of getpwuid_r, because the user may wish to
change the value of HOME after logging in.
Signed-off-by: Manuel Stoeckl <code@mstoeckl.com>
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13614>
Limited to vkd3d right now, there are specific use cases there.
We don't want any app to disable compression, it should be mostly
transparent and we better be aware of potential bugs.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28632>
When TSAN is enabled we use standard mutexes instead of futexes. With
futexes the fence->signalled is read using an atomic operation, to best
mimic this let's protect the read with a locked mutex.
This avoids TSAN reporting a race condition (false positive with
futexes) with Zink when accessing the pipeline cache.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28650>
Thread sanitizer doesn't support futexes, so don't use them in this case
and fall back to standard mutexes. With that we can avoid tsan reporting
a large number of false positives.
v2: use #if instead of #ifdef to test the value of the define
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28650>
The timeout compute is invalid
Fixes: 095dfc6caa ("util: Move the implementation of futex_wake and futex_wait from futex.h to futex.c")
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28473>
This is achieved by the following steps:
#ifndef DEBUG => #if !MESA_DEBUG
defined(DEBUG) => MESA_DEBUG
#ifdef DEBUG => #if MESA_DEBUG
This is done by replace in vscode
excludes
docs,*.rs,addrlib,src/imgui,*.sh,src/intel/vulkan/grl/gpu
These are safe because those files should keep DEBUG macro is already excluded;
and not directly replace DEBUG, as we have some symbols around it.
Use debug or NDEBUG instead of DEBUG in comments when proper
This for reduce the usage of DEBUG,
so it's easier migrating to MESA_DEBUG
These are found when migrating DEBUG to MESA_DEBUG,
these are all comment update, so it's safe
Replace comment /* DEBUG */ and /* !DEBUG */ with proper /* MESA_DEBUG */ or /* !MESA_DEBUG */ manually
DEBUG || !NDEBUG -> MESA_DEBUG || !NDEBUG
!DEBUG && NDEBUG -> !(MESA_DEBUG || !NDEBUG)
Replace the DEBUG present in comment with proper new MESA_DEBUG manually
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28092>
Make the filename more descriptive and since the file is used by
multiple drivers, move it into appropriate util/ directory.
Cosmetics:
- use SPDX license tag
- add newline before main function
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27804>
These have been incredibly useful when debugging regressions and weird
behavior in the Intel backend when trying to spill multiple registers
before retrying allocation. With them, we can print out not only what
register was chosen, but the benefit and cost. Seeing lists of chosen
registers where the benefit/cost was not sorted, and poor options were
chosen before better ones, led me to investigate a number of issues.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28257>
we typically use a hash table with a fixed struct key, but this requires tedious
boilerplate. add a macro that generates all the boilerplate for you so you can
just create a table and go.
naming inspired by Rust #![derive].
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28056>
Similar to util_format_get_component_bits, get the bit offset for a
particular channel in a given format.
Use this to calculate the shift/mask sets for formats when creating DRI
configs, as a prelude to ripping out and replacing the hardcoded table.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27709>
needs some nontrivial wrapping but can mostly fallback to the regular impl.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27762>
Some D3D11 games rely on out-of-bounds indirect UBO loads to return
real values from underlying bound descriptor. This workaround would
prevent us from lowering indirectly accessed UBOs to consts.
Later DXVK would declare dynamically indexed uniforms with upper
size bound, to make the accesses spec compliant. But for now
we need our own workaround.
Known affected games:
- Dark Souls 3
- Sekiro: Shadows Die Twice
- Final Fantasy Type-0 HD
- Ultrakill
- Dishonored 2
DXVK discussions:
- https://github.com/doitsujin/dxvk/issues/405
- https://github.com/doitsujin/dxvk/issues/3861
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27727>
This adds the following template options:
- add an option to fill TC set_vertex_buffers from st_update_array directly
(always true without u_vbuf, so always used with radeonsi)
- add an option saying that there are no zero-stride attribs
- add an option saying that there are no user buffers
(always true with glthread, so always used with radeonsi)
- add an option saying that there is an identity mapping between vertex
buffers and vertex attribs
I have specifically chosen those options because they improve performance.
I also had other options that didn't, like unrolling the setup_arrays loop.
This adds a total of 42 variants of st_update_array_templ for various cases.
Usually only a few of them are used in practice.
Overhead of st_prepare_draw in VP2020/Catia:
Before: 8.5% of CPU used
After: 6.13% of CPU used
That's 2.37% improvement. Since there are 4 threads using the CPU and
the percentage includes all threads in the system, the improvement for
the GL thread is about 8% (roughly 2.17% * 4; each thread at 25% of global
utilization means 100% utilization in 4 cores).
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27731>