On some malloc implementation, malloc doesn't always align to 16
bytes even on 64 bits system. To make sure ralloc_header always
starts at the wanted alignment, just force the size to be aligned at
the alignment of ralloc_header. This fixes crashed on instruction
like "movaps %xmm0,0x10(%rax)" which requires aligned memory access.
Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6314>
We were calling the table-based unpack functions from inside the pack and
unpack table's methods, so if anything included these pack functions (such
as a call to a table-based pack function), you'd pull in all of unpack as
well.
By calling them explicitly, we save some overhead in these functions
(switch statement, address math on the zero x,y arguments) anyway.
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6307>
Like we just did for pack functions for freedreno, it will be useful to be
able to pick out a specific rgba unpack function instead of going through
the table.
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6307>
Game does the equivalent of a
ALIGN(..., minUniformBufferOffsetAlignment >> 4)
which breaks when said alignment is <16 with a SIGFPE.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6120>
This adds applicationName + version through like engineName.
Rationale: A game (World War Z) includes the store name in the
executable name, so has multiple executable names.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6120>
The vallium layer has a requirement to insert and extra the 24-bit
unorm value as a unorm value (not as a float etc). Add helpers
to facilitate that.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6082>
This lets the compiler CSE calls to them on the same format. This is
particularly relevant for the description table lookup calls, which other
inlines might do internally.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6305>
A single format either had the float, the sint, or the uint version.
Making the dst be void * lets us store them in the same slot and not have
logic in the callers to call the right one.
-6kb on gallium drivers
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6305>
Changes are necessary after commit 84ed2d098
("util/format: expose generated format packing functions through a header")
because script for format/u_format_pack.h has different commandline
compared to existing pattern
Generated sources rules are ported from meson
Fixes: 84ed2d098 ("util/format: expose generated format packing functions through a header")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6319>
Some of the generated functions can be useful without going through the
format table (filling border color struct in turnip). By not calling these
functions through the format table, we should eventually be able to garbage
collect the unused packing functions, and also allows LTOs to happen.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6093>
I did this as a separate commit to make the previous one more reviewable.
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5826>
This saves us 13 to 35kb on release drivers in my builds.
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5826>
Fix warnings reported by Coverity Scan.
Resource leak (RESOURCE_LEAK)
leaked_storage: Variable bt1 going out of scope leaks the storage
it points to.
leaked_storage: Variable bt2 going out of scope leaks the storage
it points to.
Fixes: d0d14f3f64 ("util: Add unit test for stack backtrace caputure")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6246>
The option is not a toggle to "allow GPU vendor to be overridden", it
*is* the override.
Fixes: dca119f12c ("mesa/gallium: add dric option to allow overriding GL vendor string")
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6207>
The previous implementation kept a hashtable of a Backtrace object per
thread. debug_backtrace_capture is supposed to store a backtrace in
the passed in debug_stack_frame array, but instead overwrote the
per-thread Backtrace object.
This new version works more like the libunwind based capture. We hash
the file and symbol names and store pointers in the debug_stack_frame
struct. This way debug_backtrace_capture doesn't overwrite previous
captures or allocate memory that needs to be freed.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6112>
The generated .c had a bunch of NULLs and notes for what kind of function
was being skipped, when we can just skip them by filling in the fields
with names.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5728>
It will return the same table every time with no other side effects, so we
want it to be CSEed. Saves 3.5k on my aarch64 GL drivers, almost 9k on
turnip, but weirdly increases my x86 GL driver collection by ~3k.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5728>
The compton compositor is unmaintained, with a new fork named picom taking
its place. As with the other compositors (including compton), adaptive
sync should not be enabled.
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5740>
XXH32 is doing access through u32 *, and with strict aliasing the compiler
gets to assume that those are independent of the u16 writes we did in
fd6_texture_key setup, and based on various tweaks to the code, would
result in bad hashes computed after inlining. The failure was:
../src/util/hash_table.c:326:_mesa_hash_table_search_pre_hashed: Assertion
`ht->key_hash_function == ((void *)0) || hash == ht->key_hash_function(key)'
failed.)
By setting these two flags, we always take the unaligned,
memcpy-the-32-bit-data path. I believe this should be same perf on x86
(which will happily unaligned load 32 bits in the end), while it will be
slower on arm (where you have to a special unaligned load operation iirc).
This should still be far faster than our old hash.
Fixes: edd62619a1 ("freedreno: replace fnv1a hash function with xxhash")
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5271>
DOOM Eternal happily creates a swapchain with 2 images for IMMEDIATE.
This fixes a 10% performance issue with RADV.
Cc: 20.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5704>