Hoping that I didn't miss any, this *should* add assertions
to all functions and passes which explicitly handle 'nir_loop'.
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13962>
Builds on the work of !15121. This gets to delete even more code
because many drivers shared a lot of code for i2b and f2b.
No shader-db or fossil-db changes on any Intel platform.
v2: Rebase on 1a35acd8d9.
v3: Update a comment in nir_opcodes_c.py. Suggested by Konstantin.
v4: Another rebase. Remove f2b stuff from Midgard.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20509>
Now that renderonly.h includes util/simple_mtx.h, which itself includes
valgrind.h, dep_valgrind is required by any module that includes
renderonly.h.
In file included from ../src/gallium/auxiliary/renderonly/renderonly.h:33,
from ../src/gallium/winsys/kmsro/drm/kmsro_drm_winsys.c:39:
../src/util/simple_mtx.h:34:12: fatal error: valgrind.h: No such file or directory
34 | # include <valgrind.h>
| ^~~~~~~~~~~~
compilation terminated.
dep_valgrind is part of idep_mesautil, which should be used instead of
copying the list of deps for each util header included (which would
have to be updated every time a util header changes its own includes),
so let's add idep_mesautil everywhere that includes renderonly.h.
Fixes: ad4d7ca833 ("kmsro: Fix renderonly_scanout BO aliasing")
Tested-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20530>
When importing a BO, if it is already imported, then the handle will
alias an existing BO instance. It is possible for the existing owner to
free the BO after the import and leave a dangling handle before we get a
chance to increase the refcount, so we need to lock the BO table mutex
before importing, to make sure nobody else goes through the free path
during that window.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20403>
There are a lot of optimizations in opt_algebraic that match ('ine', a,
0), but there are almost none that match i2b. Instead of adding a huge
pile of additional patterns (including variations that include both ine
and i2b), always lower i2b to a != 0.
At this point in the series, it should be impossible for anything to
generate i2b, so there /should not/ be any changes.
The failing test on d3d12 is a pre-existing bug that is triggered by
this change. I talked to Jesse about it, and, after some analysis, he
suggested just adding it to the list of known failures.
v2: Don't rematerialize i2b instructions in dxil_nir_lower_x2b.
v3: Don't rematerialize i2b instructions in zink_nir_algebraic.py.
v4: Fix zink-on-TGL CI failures by calling nir_opt_algebraic after
nir_lower_doubles makes progress. The latter can generate b2i
instructions, but nir_lower_int64 can't handle them (anymore).
v5: Add back most of the hunk at line 2125 of nir_opt_algebraic.py. I
had accidentally removed the f2b(bf2(x)) optimization.
v6: Just eliminate the i2b instruction.
v7: Remove missed i2b32 in midgard_compile.c. Remove (now unused)
emit_alu_i2orf2_b1 function from sfn_instr_alu.cpp. Previously this
function was still used. 🤷
No shader-db changes on any Intel platform.
All Intel platforms had similar results. (Ice Lake shown)
Instructions in all programs: 141165875 -> 141165873 (-0.0%)
Instructions helped: 2
Cycles in all programs: 9098956382 -> 9098956350 (-0.0%)
Cycles helped: 2
The two Vulkan shaders are helped because of the "new" (('b2i32',
('ine', ('ubfe', a, b, 1), 0)), ('ubfe', a, b, 1)) algebraic pattern.
Acked-by: Jesse Natalie <jenatali@microsoft.com> [earlier version]
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev> [earlier version]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
Supported in all hardware and software drivers. Only that don't support
are virgl and svga, depending on host capabilities. I don't think
there's anything to be done there. This does give fewer places to screw
up the CAPs, though, because everyone wants ARB_buffer_storage.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Marek Ol<C5><A1><C3><A1>k <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19392>
This means we don't have the variant keys, and need to make up one
variant and pre-compile it.
Signed-off-by: Eric Engestrom <eric@igalia.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18599>
This is known to produce bogus results for certain combinations of
operands, so don't use it. See this issue for details:
https://gitlab.freedesktop.org/mesa/mesa/-/issues/6555
None of the shaders in shaderdb are affected by this change, as all of
the shaders containing an integer division require a higher GLSL version
than vc4 supports and are therefore all skipped.
This is port of the v3d change 73e8fc3efb
to vc4; we expect the impact to be the same as for v3d, based on testing
a custom shader.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18019>
We can produce slightly better code for these in the backend, so
do that.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18019>
All current drivers reports supporting this cap, let's just assume
it's always supported.
It seems better to lower this in the drivers, like we already do for
etnaviv, panfrost and zink...
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Begrudgingly-reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19049>
We only copy a single layer, so let's not even try to support deep
blits here.
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Tested-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18427>
Whenever we add or remove a flag here, we need to update a bunch of
drivers in a fragile way. Moving to flags here instead should make this
a bit easier to maintain in the future.
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17959>
The main issue was the inconsistent use of `unlikely()`, but the macro
also simplifies the code a little bit.
Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18086>
When unpacking the texture sample result ensure it is moved to the
proper expected dest register.
This fixes incorrect texturing in Chromium using PixiJS framework.
CC: mesa-stable
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18122>
So far we create dumb buffers to emulate the ioctls in VC4, but these
doesn't work when using an Intel or AMD GPUs.
As in the case of V3D, let's use specific ioctls for these cases.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Tested-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Iago Toral Quroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17872>
As it is needed to have V3D_DEBUG defined. For the v3d case, I did it
restoring v3d_process_debug_variable, as it is at v3d_debug.c that
DEBUG_GET_ONCE_FLAGS_OPTION is called.
Fixes: 106b33405e ("vc4/v3d: stop adding NORAST when SHADERDB debug option is used")
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17812>
Right now if we use the option SHADERDB, NORAST is added
automatically. There's no comment justifying it, neither a lot of info
on the commits that added that. But I guess that the purpose is that
SHADERDB option is assumed to be used only to gather shader-db stats,
so setting setting NORAST would allow to get those dumps faster.
But adding debug options automatically can be confusing, as we could
get a behaviour that we were not expecting. At least I needed to check
why using SHADERDB was getting a black screen. And if we want to get
this behaviour, we can easily add manually the NORAST.
Finally, v3dv doesn't support NORAST right now (and we don't have
immediate plans to implement it), so it is somewhat inconsistent to
get different behaviour from the same debug option from the two
drivers.
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17788>
The hardware doesn't support 3D textures. We had been lying about 3D
texture level support in the past so that we got GL 2.1, but now reporting
levels==0 doesn't disable GL 2.1 (since we don't check for GL2 extensions
any more). But, by not lying, we now fix the majority of the remaining
GLES2 deqp failures.
This regresses a few desktop GL piglits which get GL errors that they
notice instead of what would be silent rendering failures on 3D texturing
operations.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17350>
This was missing, and the added validation caught it.
Fixes: 708c47e663 ("nir: Validate nir_tex_instr::dest_type bitsize")
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17172>
This is used for the old, buggy and slow GLSL IR loop unrolling
code. All drivers have now switched to the NIR unrolling code so
here we remove the CAP.
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16366>
The only interesting ones here were LOWER_IF_THRESHOLD (which previously
had connected to some lowering in GLSL that was broken in the face of side
effects), and FMA (which turned GLSL IR's fma() into TGSI_OPCODE_FMA
instead of MAD).
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8044>
This controls the whole lowering of "make tex ops with implicit
derivatives on non-implicit-derivative stages be tex ops with an explicit
lod of 0 instead", but it's really hard to describe that in a git commit
summary.
All existing callers get it added except:
- nir_to_tgsi which didn't want it.
- nouveau, which didn't want it (fixes regressions in shadowcube and
shadow2darray with NIR, since the shading languages don't expose txl of
those sampler types and thus it's not supported in HW)
- optional lowering passes in mesa/st (lower_rect, YUV lowering, etc)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16156>
util_cpu_detect is an anti-pattern: it relies on callers high up in the call
chain initializing a local implementation detail. As a real example, I added:
...a Mali compiler unit test
...that called bi_imm_f16() to construct an FP16 immediate
...that calls _mesa_float_to_half internally
...that calls util_get_cpu_caps internally, but only on x86_64!
...that relies on util_cpu_detect having been called before.
As a consequence, this unit test:
...crashes on x86_64 with USE_X86_64_ASM set
...passes on every other architecture
...works on my local arm64 workstation and on my test board
...failed CI which runs on x86_64
...needed to have a random util_cpu_detect() call sprinkled in.
This is a bad design decision. It pollutes the tree with magic, it causes
mysterious CI failures especially for non-x86_64 developers, and it is not
justified by a micro-optimization.
Instead, let's call util_cpu_detect directly from util_get_cpu_caps, avoiding
the footgun where it fails to be called. This cleans up Mesa's design,
simplifies the tree, and avoids a class of a (possibly platform-specific)
failures. To mitigate the added overhead, wrap it all in a (fast) atomic
load check and declare the whole thing as ATTRIBUTE_CONST so the
compiler will CSE calls to util_cpu_detect.
Co-authored-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15580>
It might be possible to combine this with the other merge to avoid
the overheads of making a temp copy.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15516>
These aren't spiecic to TGSI any more, so let's rename them to reflect
reality.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15316>
These aren't specific to TGSI, so let's rename them to reflect the
reality.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15316>