This was a hack needed for the old transform feedback code. This barrier
is handled by the explicit XFB emulation that we're using on Midgard
now, so we don't need the barrier in the general case.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19760>
Unconditionally lowering prevents GL drivers from natively
implementing these ops. Drivers that need lowering should set
lower_uadd_carry and lower_usub_borrow on nir_shader_compiler_options to
get the nir lowerings.
Tested with dEQP-GLES31.functional.shaders.builtin_functions.integer.*
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19704>
Calling sched_setscheduler twice every sample period has high CPU
overhead. For intel and panfrost, their dump_perfcnt is preemptible and
they don't need the scheduler change.
For freedreno, simply makes the main thread RT at all time. This solves
most of the cpu overhead issue.
v2: removed pthread_t param and just change the scheduler for the
calling thread
Acked-by: Rob Clark <robdclark@chromium.org> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19668>
This should be the last commit, and should be take care that can only in comment block or
version
Exclude files:
src/util/detect_*.h
From:
PIPE_(OS|ARCH|CC)_([0-9A-Z_]+)
To:
DETECT_$1_$2
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19674>
From:
defined[\s]*\([\s]*PIPE_(OS|ARCH|CC)_([0-9A-Z_]+)[\s]*\)
To:
DETECT_$1_$2
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19674>
Even though the defines in p_config.h are stared with PIPE_, they are indeed
are generic detecting mechanics, we will rename them to DETECT_* in latter MR
We rename src/gallium/include/pipe/p_config.h src/util/detect_arch.h because
the detect code in src/gallium/include/pipe/p_config.h are most about
processor architecture detecting.
The file util/detect.h is added to replace functional of src/gallium/include/pipe/p_config.h
So we replace of #include "pipe/p_config.h" with #include "util/detect.h"
The file util/detect_cc.h is added as a placeholder for moving compiler related macro defines
from p_config.h into it in following commits
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19674>
Fixes: b557ceb7 ("frontends/va: Add windows VA frontend support via vl_winsys_win32 and libva-win32")
Closes: #7702
Signed-off-by: Sil Vilerino <sivileri@microsoft.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19754>
Use "util/detect_os.h" instead of "pipe/p_config.h" and "pipe/p_compiler.h"
in src/util/os_mman.h
This is a prepare to implement os_mman on windows
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19645>
_mesa_get_cpu_features is no more a needed thing as all it's usage are
replaced with util_get_cpu_caps in u_cpu_detect.h
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19564>
cpu_has_sse4_1 doesn't belongs to src/util, so do not depends on it,
this is a follow up of that u_cpu_detect.* doesn't depends on
pipe/p_*.h anymore
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19564>
It's comes from src/gallium/include/pipe/p_config.h and that getting
streaming-load-memcpy.c can not use of it
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19564>
Use the optimization pass to try to remove them, and if that fails,
at least use the core nir lowering pass which is better about fixed-
size memcpys compared to the one we wrote for DXIL.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19752>
s_wqm_b64 clobbers SCC.
Found this while working on dual source blending.
Fixes: 6113ee650a ("aco/gfx11: fix FS input loads in quad-divergent control flow")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19747>
Since b29fe26d43 ("etnaviv: rework ZSA into a derived state") the PE
depth config is adjusted by etna_update_zsa() when the shader is changed.
When the PE depth config is actually changing as a result of this
adjustment the ZSA state is dirtied, so there is no need to emit the
state unconditionally when the shader is changed.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19686>
In linear PE mode the early and late depth stage do not only disagree
about the cache layout, but they seem to fundamentally disagree about
the buffer layout. When Z was written via the late stage, early tests
always show spurious zfails, even if they are not in the same draw
call. Cache flushing and pipe stalls don't help in that case.
The only option to get reliable Z tests with linear render targets is
to move all Z handling into the PE stage. Even when early Z writes
are possible, we don't know if any other draw to the same surface
needs late Z handling, so we must never use the early stage.
Fixes: 53445284a4 ("etnaviv: add linear PE support")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19686>
Apparently MSAA doesn't only add another input, but it also increases
required temporaries by one. Simple programs where the register demand
is given by the number of inputs did work fine, while more complex ones,
where register demand is given by the number of temporaries exhibit
rendering issues without this fix.
Cc: 22.3 mesa-stable
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19582>
reserve() in rtasm_x86sse compares a pointer difference with some
integers to check if reallocation is needed.
It unfortunately groups the first pointer with an int, which makes it
possible to hit nullptr-with-nonzero-offset under Undefined Behavior
Sanitizer.
This patch suggests a reordering of the arithmetic expression so that
first the pointer difference is computed, and from that on it's just a
usual integer arithmetic, avoiding nullptr-with-nonzero-offset.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18522>