There's a few changes in here that are very inter-related.
First, we stop lowering load_deref on shader_temp to load_ptr_dxil,
and just leave it as load_deref. In order for that to work, we need
the derefs to be in a shape that's acceptable to DXIL, so the only
current producer of shader_temp loads (the CLC frontend) needs to
run some lowering passes on them first.
The DXIL backend is augmented to just write out deref indices while
walking a deref chain, which will get combined in the load op into
a GEP instruction. For non-mesh/raytracing shaders, these are required
to be single-level scalar arrays, but the complexity here is preparation
for when we don't need to do that anymore.
Additionally, the const lookups are changed from using a hash table
to just putting an index on the variable.
All of this together is enough to enable the authored-forever-ago test
which uses indirect array access into a const packed struct. The
load_ptr_dxil handling didn't deal with packed structs / unaligned
accesses, but now that we're in a logical address space with derefs
instead of physical, there's no alignment to deal with anymore and
the fact that it's packed goes out the window.
This removes one custom DXIL intrinsic.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
DXIL requires GEP chains to point to a global variable that's a flat
array of primitive types. If we're converting deref chains to GEP
chains, we're effectively in a logical address space, which means
we can do things like change sizes of variables, since we know
they won't alias with anything else. If they could alias, we'd be
lowering them to an explicit I/O op instead. That means we can
start disabling some of the low-bit-size lowering.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
Now that we try harder for memcpys, we can use nir's complex usage helper.
We also can just mark the vars instead of using a hash map, since location
doesn't mean anything for constant vars.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
For the case of memset, the SPIR-V translator produces a copy from
a byte array of 0s. If we wait to lower memcpys until after types
are sized, we can potentially turn those 0s into SSA zeros and remove
the entire constant array.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
I was seeing u2u64 still in my final shader after pack/unpack were
lowered, which sounds to me like some other optimizations are missing
for detecting the post-lowering pack/unpack patterns, but let's at
least add some patterns for the simple cases.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
We'd like to use this callback to adjust loads and stores from things
that are unsupported to things that are supported, but if the input
is already supported, we'd prefer not to change it. Rather than making
up a bit size that'd work and doing a bunch of pack/unpack bit math,
only return a different bit size if the input one doesn't work for us
(i.e. can't load enough memory or just an unsupported size entirely).
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
This pass handles 16-component 8-bit loads, 8-component 16-bit loads,
and 2-component 64-bit loads. The number of components for the fallback
case doesn't need to be 4.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
The var splitting pass can rearrange the variables as long as their
position in memory doesn't matter. For block-arranged variables,
or things like memcpys or casts, the layout matters, but atomics
don't imply anything about the layout of the overall variable, so
don't treat them as "complex" for this use case.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
Idiomatic DXIL has constants contained within global variables rather
than a big blob of data. Doing this allows us to have 16-bit and 64-bit
data as well, where normally bitcasts would be disallowed on variable
GEP chains.
Unfortunately, DXIL validation requires SOA to be turned into AOS,
which means we need to split structs. We want to be able to run this
on nir_var_mem_constant variables which have constant initializers,
so add a bit of logic to handle that case, and relax the mode validation.
There's nothing special about the modes it was set up to handle.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
Instead of on Android. Which allows an end user to turn off expat
without breaking or disabling Intel support. I've additionally
refactored to separate expat and xmlconfig a bit more in the root
meson.build
This does make expat a hard dependency for building Intel tools, despite
the fact that only aubinator actually requires it. This simplifies the
build for the common case, and in the event that someone wants to build
the Intel tools and doesn't have libexpat, they can fall back to the
meson wrap for expat instead.
fixes: 75276deebc
closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8791
Reviewed-by: Mark Janes <markjanes@swizzler.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23605>
No longer used now that we don't dynamically generate dispatch stubs.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23451>
Since we don't support loading an older driver with newer loader any more,
we don't need to bother tracking entrypoints that Mesa no longer supports.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23451>
Mesa core doesn't need to have mapi sanity check that our aliases all map
to the same offset. That's a build-time decision.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23451>
We no longer use the address field, and the name is always a size_t offset
in the string pool (never a dynamic strduped name).
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23451>
Since we don't generate dynamic dispatch stubs any more, we don't need
this data.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23451>
Since Mesa drivers are now version-locked to the loader, that means that
we never need to support a newer hardware driver than the loader, and thus
don't need to generate dynamic dispatch stubs. This is great news, given
that we don't test those paths, and it involved delightful features like
arrays of hex for code to be pasted into executable memory.
More code removal will follow, this is the first cut of "don't generate,
and DCE generation code".
Fixes: #9158
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23451>
The formatting was so broken I couldn't follow what was going on.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23451>
While this is a generic bit twiddling ALU instruction, it's especially useful
for address calculations, since the architecture's tiled textures use Morton
coding within the tiles.
This will be used when lowering image_texel_address on AGX, as part of the image
atomics implementation. I don't know if there's any other neat uses I could
detect with opt_algebraic, this doesn't seem like an operation a shader would
open-code... Maybe useful for BVH building or something...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23513>
The find-remove-use pattern is quite natural for texture lowering :)
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23513>
I have this in the AGX compiler but I want to use it in more places.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23513>
This change fixes a buffer overflow by implementing the
special swizzles. This behavior is already available with
evergreen_convert_border_color().
For instance, this issue is triggered on a cayman gpu with
"piglit/bin/texwrap bordercolor -auto -fbo" or "piglit/bin/max-samplers -auto -fbo":
==5610==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000012d20 at pc 0x7fb798cb876f bp 0x7ffd78670460 sp 0x7ffd78670458
READ of size 4 at 0x603000012d20 thread T0
#0 0x7fb798cb876e in cayman_convert_border_color ../src/gallium/drivers/r600/evergreen_state.c:2444
#1 0x7fb798cb876e in evergreen_emit_sampler_states ../src/gallium/drivers/r600/evergreen_state.c:2539
#2 0x7fb7989e6cb2 in r600_emit_atom ../src/gallium/drivers/r600/r600_pipe.h:655
#3 0x7fb7989e6cb2 in r600_draw_vbo ../src/gallium/drivers/r600/r600_state_common.c:2333
#4 0x7fb7985082c7 in u_vbuf_draw_vbo ../src/gallium/auxiliary/util/u_vbuf.c:1497
#5 0x7fb796ef2eda in cso_draw_vbo ../src/gallium/auxiliary/cso_cache/cso_context.h:262
#6 0x7fb796ef2eda in st_draw_gallium_multimode ../src/mesa/state_tracker/st_draw.c:170
#7 0x7fb7970d9cfd in vbo_exec_vtx_flush ../src/mesa/vbo/vbo_exec_draw.c:341
#8 0x7fb7970d32d7 in vbo_exec_FlushVertices_internal ../src/mesa/vbo/vbo_exec_api.c:693
#9 0x7fb7970d32d7 in vbo_exec_FlushVertices ../src/mesa/vbo/vbo_exec_api.c:1193
#10 0x7fb7975f237c in enable_texture ../src/mesa/main/enable.c:337
Fixes: 923d635357 ("r600: fix some border color swizzles on CAYMAN")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23435>
Since b167203cfe ("mesa/st: Always generate NIR from GLSL, and use
nir_to_tgsi for TGSI drivers."), it's always set in the context.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23114>
The GL frontend is about to start only querying against NIR, since it only
generates NIR.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23114>
Now that we don't have to generate TGSI, we can drop an indirection.
Fixing up the types here prompted a little fixup of
st_nir_make_passthrough_shader()'s types.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23114>
Now everyone's saying NIR, and doing any NTT internally. The only returns
of TGSI were in gallivm_get_shader_param() and
tgsi_exec_get_shader_param(), but the drivers were returning NIR instead
of calling down to them.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23114>
The flag has been here for a long time, it's time for SVGA to start
ingesting NIR like other drivers do.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23114>
The NIR path is default, and well-trodden at this point. By dropping
support for input as TGSI, we get rely on the NIR trig lowering.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23114>