BITSET_FOR_EACH_SET can walk a sparse set (such as a register class's set
of registers) much faster than just iterating over individual bits.
Improves freedreno startup time (as measured by shader-db ./run
shaders/closed/gputest/triangle on my x86 system) by -4.12679% +/-
1.99006% (n=151)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4537>
This make the code significantly more readable, I think (along with
shorter). Also, using util_dynarray_delete_unordered() saves us a move of
the rest of the list when removing adjacency on a node.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4537>
This moves the fi_types to a new mesa_private.h and removes the
imports.c file. The vast majority of this patch is just removing
pound includes of imports.h and fixing up the recursive includes.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3024>
MSVC 2015 and newer has perfectly valid snprintf and vsnprintf
implementations, let's just use those.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3024>
Mesa has one of these in imports.h, so u_memory needs one as well. This
is the version from mesa ported.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3024>
This makes more sense for it, it's only used in the glsl compiler
currently, so we could probably move it there, but this seems fine for a
header only #define.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3024>
This adds two new util functions to rounding.h, _mesa_iroundf and
mesa_lround, which are just wrappers around roundf and round, that cast
to int and long int respectively. This is possible since mesa recently
dropped support for VC2013, since 2015 and 2017 support roundf.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3024>
Which has the same behavior as long as you don't change the FPU rounding
mode. Other code in mesa makes the same assumption so it should be safe
to make that assumption more generally.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3024>
which are exactly the same function with exactly the same implementation
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3024>
This is copied from the one in src/mesa/main/imports.h, which is the
same otherwise.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3024>
The implementation is somewhat different, although if you go back in
time far enough they're the same, but the one in u_math was changed a
long time back to be faster.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3024>
The 64 bit variant in imports.h isn't even used.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3024>
Mostly this uses util_is_power_of_two_or_zero, which has the same
behavior as _mesa_is_pow_two when the input is zero. In cases where the
value is known to be != 0 ahead of time I used the _nonzero variant as
it may be faster on some platforms.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3024>
Probably doesn't fix anything but those should be accessed in an
atomic way just like the head pointer.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e4f01eca3b ("util: Add a free list structure for use with util_sparse_array")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4613>
By temporarily storing the new_head by a uint32_t, we wipe out the
counter section of the head pointer.
Fixes: e4f01eca ("util: Add a free list structure for use with util_sparse_array")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4612>
This just silences a compiler-warning about a potentially uninitialized
variable. It's not uninitialized, but it's a bit hard for the compiler
to see. So let's just initialize it to zero.
Reviewed-by: Brian Paul <brianp@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4577>
These are less mesa specific than the rest of macros.h, and would be
nice to use outside of mesa. Prep for next patch.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4381>
Let's make it clear what includes are being added everywhere, so that
they can be cleaned up.
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4360>
Windows provides MAX_PATH rather than PATH_MAX for the maximum allowable
path length. This is not a limit on the length of filename which can
exist on the filesystem, but a length on the length of path which can be
passed to Win32 API calls.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Fixes: f8f1413070 ("util/u_process: add util_get_process_exec_path")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4304>
This increases performance by 11-13% in Viewperf11/Catia - first scene.
Set allow_draw_out_of_order=true to enable this.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4152>
This implementation was removed by 8b8af6d3 ("gallium/util: Switch
util_float_to_half to _mesa_float_to_half()'s impl.")
It was not actually broken, but _mesa_float_to_half() implements
round-to-nearest-even, whereas util_float_to_half() implemented
round-to-zero. So rename it appropriately.
GL actually never cares about rounding (except a broken piglit test),
however d3d10 very much does and requires RTZ for float to half
conversion. Moreover, apparently at least radeon gpus actually always
do RTZ when doing RT writes (and I'd suspect for shader image writes
as well). Hence it seems appropriate to hook up this rtz function to
the format instead. This will cause llvmpipe and softpipe to use rtz
rounding for clears with half float formats, and softpipe would use rtz
behavior for rt writes as well (llvmpipe has that hardcoded), not sure
if "real" hw drivers hit this function for much.
(For shader opcodes would still need to figure out what rounding to use
appropriately, but this is a question for another day.)
Note should probably unify with _mesa_float_to_float16_rtz. Unclear at
this point which one is better, so just restore previous function here.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4312>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4312>
This is useful to enable workarounds for applications with a generic name.
For instance all games made with the YoYo game engine have the same executable
name "runner".
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4181>