The bulk of it can be reused to implement iris's internal non-bindless
image lowering, which I would like to reuse in freedreno, v3d, and
nir-to-tgsi.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3728>
This eliminates conversions between f16 and f32 where possible. We can
always remove an upcast followed by a down cast, that is:
f2f16 ( f2f32 (a) ) -> a
f2fmp ( f2f32 (a) ) -> a
In the other direction, f2f16 loses precision and can't be undone by a
f2f32. However, by definition it's always safe to elminate f2fmp:
f2f32 ( f2fmp (a) ) -> a
v2. [Neil Roberts (nroberts@igalia.com)]
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3822>
This opcode is the same as the f2f16 opcode except that it comes with
a promise that it is safe to optimise it out if the result is
immediately converted back to a 32-bit float again. Normally this
would be a lossy conversion and so it would be visible to the
application, but if the conversion is generated as part of the mediump
lowering process then this removal doesn’t matter. The opcode is
eventually replaced with a regular f2f16 in the late optimisations so
the backends don’t need to handle it.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3822>
Because it's in double-quotes, it will search the current folder before
any search paths. Since nir_builder.h and nir_builtin_builder.h are in
the same folder, this guarantees a correct include. However,
nir/nir_builder.h does not unless the includer's path is set up just
right.
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3897>
To facilitate lowering SSBOs to globals, we need a load_ssbo_address
intrinsic. This intrinsic takes an SSBO index and loads the address in
global memory of the SSBO (likely implemented via a uniform in the
driver). In the future, we'll support bounds checking, but at the moment
this is not supported (this pass should only be used for trusted
contexts at the moment, i.e. contexts without robustness extensions).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2753>
"index" is an offset into a linearized 3-dimensional array. Starting
with fbd5359a0a, the 3-dimensional array can have 43 elements in each
dimension. 43**3 = 79507, and that will overflow the uint16_t.
See also the discussion in MR !3765.
Fixes: fbd5359a0a ("nir/algebraic: Rearrange bcsel sequences generated by nir_opt_peephole_select")
Suggested-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3871>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3871>
==7265== 248 (120 direct, 128 indirect) bytes in 1 blocks are definitely lost in loss record 1,438 of 1,465
==7265== at 0x483980B: malloc (vg_replace_malloc.c:309)
==7265== by 0x598A2AB: ralloc_size (ralloc.c:119)
==7265== by 0x598F861: _mesa_set_create (set.c:127)
==7265== by 0x599079D: _mesa_pointer_set_create (set.c:570)
==7265== by 0x58BD7D1: build_program_resource_list(gl_context*, gl_shader_program*, bool) (linker.cpp:4026)
==7265== by 0x548231B: st_link_shader (st_glsl_to_ir.cpp:170)
==7265== by 0x54DA269: _mesa_glsl_link_shader (ir_to_mesa.cpp:3119)
Fixes: a6aedc66 ("st/glsl_to_nir: use nir based program resource list builder")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3574>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3574>
This adds support for OpVariable having an initializer that points to
another variable, rather than a constant. In this case, the variable is
initialized to a pointer to the other variable.
Fixes Vulkan CTS tests:
dEQP-VK.spirv_assembly.instruction.compute.variable_init.private.*
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3047>
Add a pointer_initializer field to nir_variable analogous to
constant_initializer, which can be used to initialize the nir_variable
to a pointer to another nir_variable. Just like the
constant_initializer, the pointer_initializer gets eliminated in the
nir_lower_constant_initializers pass.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3047>
r600 reads vec4 from the UBO, but the offsets in nir are evaluated to the component.
If the offsets are not literal then all non-vec4 reads must resolve the component
after reading a vec4 component (TODO: figure out whether there is a consistent way
to deduct the component that is actually read).
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3225>
Since the nir default implementation doesn't support vectorizing the VS
inputs and FS outputs, additional lowering passes are added here to do
just that. The work is based on the Timothy Arceri's related work.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3225>
This means you can directly use format utils on it without having to have
your own GL enum to number-of-components switch statement (or whatever) in
your vulkan backend.
Thanks to imirkin for fixing up the nouveau driver (and a couple of core
details).
This fixes the computed qualifiers for EXT_shader_image_load_store's
non-integer sizeNxM qualifiers, which we don't have tests for.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v3d)
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3355>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3355>
Midgard can't write depth and stencil separately. It has to happen in
a single store operation containing both. Let's add a panfrost specific
intrinsic and turn all depth/stencil stores into a packed depth+stencil
one.
Note that this intrinsic is not yet handled in emit_intrinsic(), but
we'll address that later.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3697>
Using LLVM 8 for ppc64el and 7 for s390x (which hits some coroutine
related issues with LLVM 8).
There are some test failures we need to ignore for now. Also, the
timeout needs to be bumped from the default 30s for some tests, because
they can take longer under emulation.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3643>
Clang has a warning about overloading virtuals that triggers when a
derived class defines a virtual function that's an overload of
function in the base class. This kind of thing:
struct chart; // let's pretend this exists
struct Base
{
virtual void* get(char* e);
};
struct Derived: public Base {
virtual void* get(chart* e); // typo, we wanted to override the same function
};
The solution is to use
using Base::get;
to be explicit about the intention to reuse the base class virtual.
We hit this a lot with out glsl ir_hierarchical_visitor visitor
pattern, so let's adds some 'using' to calm down the compiler.
See-also: https://stackoverflow.com/questions/18515183/c-overloaded-virtual-function-warning-by-clang)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3686>
In a NIR generated using SPIR-V initializers to variables, copy
propagation can end up transforming
vec1 32 ssa_33 = deref_var &@1 (shared mat2x4)
vec1 32 ssa_35 = mov ssa_33
vec1 32 ssa_7 = deref_cast (mat2x4 *)ssa_35 (shared mat2x4) /* ptr_stride=0 */
into
vec1 32 ssa_33 = deref_var &@1 (shared mat2x4)
vec1 32 ssa_7 = deref_cast (mat2x4 *)ssa_33 (shared mat2x4) /* ptr_stride=0 */
Before the optimization, the "head" of a path of deref that uses ssa_7
will be the cast. After, it will be the variable in ssa_33. Since
the types are the same, this is a trivial cast that would be picked up
by nir_opt_deref.
If we need to compare such deref-chain after optimization with another
deref-chain for the same variable, the compare function will get
confused by the cast in the middle.
One alternative would be to add nir_opt_deref to places that compare
derefs, but that might not scale well, so skip the trivial casts when
generating the paths instead.
Motivated by the discussion in
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3047#note_383660.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3420>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3420>
This introduces a new NIR intrinsic for loading inputs at a specific
vertex index.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
From the SPV_AMD_shader_explicit_vertex_parameter extension:
"Returns the value of the input <interpolant> without any
interpolation, i.e. the raw output value of previous shader
stage."
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
We need the LINEAR versions for AMD_shader_explicit_vertex_parameter.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
This introduces one more interpolation mode INTERP_MODE_EXPLICIT,
which is needed for AMD_shader_explicit_vertex_parameter.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
A few hash_table users roll their own integer hash functions which
call _mesa_hash_data to perform the hashing which ultimately calls
into XXH32 with a dynamic key length. When using small keys with a
constant size the hash rate can be greatly improved by inlining
XXH32 and providing it a constant key length, see:
https://fastcompression.blogspot.com/2018/03/xxhash-for-small-keys-impressive-power.html
Additionally, this patch removes calls to _mesa_key_hash_string and
makes them instead call _mesa_has_string directly, matching the new
integer hash functions.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>
These instructions are allowed to fetch from multisampled
subpass input attachments.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>
nir_tex_src_ms_index is re-used for the fragment index with
nir_texop_fragment_fetch to avoid introducing a new texture source type.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>