Add a pointer_initializer field to nir_variable analogous to
constant_initializer, which can be used to initialize the nir_variable
to a pointer to another nir_variable. Just like the
constant_initializer, the pointer_initializer gets eliminated in the
nir_lower_constant_initializers pass.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3047>
These were missed when the fields got added. Added it everywhere where
texture_index got used and it made sense.
Found this in "The Surge 2", where the inliner does not copy the fields,
resulting in corruption and hangs.
Fixes: 3bd5457641 "nir: Add a lowering pass for non-uniform resource access"
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1203
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3246>
Nir serializes uses nir_ssa_alu_instr_src_components in a few places to
determine how many components a src has, but that's not what this function
returns. It simply returns how many channels are used, which is still fine
for most of the code.
This was breaking code like this:
vec16 32 ssa_1 = intrinsic load_global
vec1 32 ssa_2 = fmax ssa_1.a, ssa_2.b
v2: make the 16bit encoding work for identify swizzles again
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Only NPOT vectors greater than vec4 use the extra uint32.
This is for instructions that share the dest code.
load_const and undef already support 1-16 in the header.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
vec4 scalarized ALUs typically have 4 equal instruction headers, so remove
the last 3.
There are no bits left in the ALU header for more flags, so future
extensions of NIR will have to use something like instr_type == 15
to describe more complex ALU instructions.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
It can be derived from src and var. This frees 10 bits in the header
that will be used later.
"mode" is moved in the structure, because those bits will be used for
something else later.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
- type_cast: deduplicate types if the last one is the same
- derive the type from the parent for other derefs
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
The majority of constants can be packed like this.
v2: - use enum for the packing encoding,
- trim packed_value to 20 bits add 1 bit to last_component,
which simplifies a later commit
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Store a flag stating if there was an implmentation, and use
fxn->impl as a temporary flag between deserializsation stages.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Suggested by Jason.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
That means we have only 30 bits for object IDs, because 2 bits are
sometimes used for something else.
This decrease the uncompressed shader size for the biggest Borderlands 2
shader from 33.6 KB to 23.2 KB. (31% decrease)
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
so that drivers don't have to call nir_strip manually.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
They indicate the operation does not cause overflow or underflow.
This is motivated by SPIR-V decorations NoSignedWrap and
NoUnsignedWrap.
Change the storage of `exact` to be a single bit, so they pack
together.
v2: Handle no_wrap in nir_instr_set. (Karol)
v3: Use two separate flags, since the NIR SSA values and certain
instructions are typeless, so just no_wrap would be insufficient
to know which one was referred to. (Connor)
v4: Don't use nir_instr_set to propagate the flags, unlike `exact`,
consider the instructions different if the flags have different
values. Fix hashing/comparing. (Jason)
Reviewed-by: Karol Herbst <kherbst@redhat.com> [v1]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
v2: remove & operator in a couple of memsets
add some memsets
v3: fixup lima
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
This commit adds new nir_load/store_scratch opcodes which read and write
a virtual scratch space. It's up to the back-end to figure out what to
do with it and where to put the actual scratch data.
v2: Drop const_index comments (by anholt)
Reviewed-by: Eric Anholt <eric@anholt.net>
We have a pass to lower global registers to locals and many drivers
dutifully call it. However, no one ever creates a global register ever
so it's all dead code. It's time we bury it.
Acked-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
All we ever do is initialize it to zero, clone it, print it, and
validate it. No one ever sets or uses it.
Acked-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Values inside the offsets parameter of textureGatherOffsets are required to be
constants in the range of [GL_MIN_PROGRAM_TEXTURE_GATHER_OFFSET,
GL_MAX_PROGRAM_TEXTURE_GATHER_OFFSET].
As this range is never outside [-32, 31] for all existing drivers inside mesa,
we can simply store the offsets as a int8_t[4][2] array inside nir_tex_instr.
Right now only Nvidia hardware supports this in hardware, so we can turn this
on inside Nouveau for the NIR path as it is already enabled with the TGSI one.
v2: use memcpy instead of for loops
add missing bits to nir_instr_set
don't show offsets if they are all 0
v3: default offsets aren't all 0
v4: rename offsets -> tg4_offsets
rename nir_tex_instr_has_explicit_offsets -> nir_tex_instr_has_explicit_tg4_offsets
Signed-off-by: Karol Herbst <kherbst@redhat.com>
The nir_state_slot struct had some padding that was never initialized.
Serializing the individual parts of the struct is more robust and avoids
the overhead of zeroing it at creation, so just do that.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Replace calls to create hash tables and sets that use
_mesa_hash_pointer/_mesa_key_pointer_equal with the helpers
_mesa_pointer_hash_table_create() and _mesa_pointer_set_create().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Eric Engestrom <eric@engestrom.ch>
We're going to have multiple functions, so nir_shader_get_entrypoint()
needs to do something a little smarter.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
These correspond directly to SPIR-V's OpPtrAccessChain. As such, they
treat whatever their parent gives them as if it's the first element in
some array and dereferences that array. If the parent is, itself, an
array deref, then the two indices can just be added together to get the
final array deref. However, it can also be used in cases where what you
have is a dereference to some random vec2 value somewhere. In this
case, we require a cast before the ptr_as_array and use the ptr_stride
field in the cast to provide a stride for the ptr_as_array derefs.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
nir_sweep assumes that constants area always allocated off the variable
to which they belong. Violating this assumption causes them to get
freed early and leads to use-after-free bugs.
Fixes: 120da00975 "nir: add serialization and deserialization"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107366
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
This commit adds a concept to NIR of having a blob of constant data
associated with a shader. Instead of being a UBO or uniform that can be
manipulated by the client, this constant data considered part of the
shader and remains constant across all invocations of the given shader
until the end of time. To access this constant data from the shader, we
add a new load_constant intrinsic. The intention is that drivers will
eventually lower load_constant intrinsics to load_ubo, load_uniform, or
something similar. Constant data will be used by the optimization pass
in the next commit but this concept may also be useful for OpenCL.
v2 (Jason Ekstrand):
- Rename num_constants to constant_data_size (anholt)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>