fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-23 04:18:14 +02:00

Author	SHA1	Message	Date
Ian Romanick	ee1c69fadd	glsl: Don't increase the iteration count when there are no terminators Incrementing the iteration count was intended to fix an off-by-one error when the first terminator was superseded by a later terminator. If there is no first terminator or later terminator, there is no off-by-one error. Incrementing the loop count creates one. This can be seen in loops like: do { if (something) { // No breaks or continues here. } } while (false); Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Abel Briggs <abelbriggs1@hotmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110953 Fixes: `646621c66d` ("glsl: make loop unrolling more like the nir unrolling path")	2019-06-24 14:32:33 -07:00
Caio Marcelo de Oliveira Filho	6e2ff10886	glsl/nir: Fix copying 64-bit values in uniform storage The iterator `i` already walks the right amount now that is incremented by `dmul`, so no need to `* 2`. Fixes invalid memory access in upcoming ARB_gl_spirv tests. Failure bisected by Arcady Goldmints-Orlov. Fixes: `b019fe8a5b` "glsl/nir: Fix handling of 64-bit values in uniform storage" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-24 11:32:14 -07:00
Caio Marcelo de Oliveira Filho	390ff8ac54	glsl/nir: Fix copying vector constant values For n_columns == 1, we have a vector which is handled by the else case. Fixes invalid memory access in upcoming ARB_gl_spirv tests. Failure bisected by Arcady Goldmints-Orlov. Fixes: `81e51b412e` "nir: Make nir_constant a vector rather than a matrix" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-24 11:32:14 -07:00
Daniel Schürmann	a8b0b6e52b	nir: introduce lowering of bitfield_insert to bfm and a new opcode bitfield_select. bitfield_select is defined as: bitfield_select(mask, base, insert) = (mask & base) | (~mask & insert) matching the behavior of AMD's BFI instruction. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-24 18:42:20 +02:00
Daniel Schürmann	1403c3a7bf	nir/algebraic: Use unsigned comparison when lowering bitfield insert/extract This lets us use the optimization pattern (('ult', 31, ('iand', b, 31)), False) to remove the bcsel instruction for code originating in D3D shaders. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-24 18:42:20 +02:00
Daniel Schürmann	4eeb49ea71	nir/algebraic: Remove unnecessary iand of [iu]bfe and bfm sources The [iu]bfe and bfm instructions are defined to only use the five least significant bits. This optimizes a common pattern from D3D -> SPIR-V translation. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-24 18:42:20 +02:00
Daniel Schürmann	165b7f3a44	nir: define behavior of nir_op_bfm and nir_op_u/ibfe according to SM5 spec. That is: the five least significant bits provide the values of 'bits' and 'offset' which is the case for all hardware currently supported by NIR and using the bfm/bfe instructions. This patch also changes the lowering of bitfield_insert/extract using shifts to not use bfm and removes the flag 'lower_bfm'. Tested-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-24 18:42:20 +02:00
Daniel Schürmann	a74f256c58	nir/algebraic: add optimization pattern for ('ult', a, ('and', b, a)) and friends. These optimizations are based on the fact that 'and(a,b) <= umin(a,b)'. For AMD, this series moves the optimization from LLVM to NIR, so currently no vkpipeline-db changes here. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-06-24 18:42:20 +02:00
Boris Brezillon	56434450f6	nir/lower_tex: Add an assert() in nir_lower_txs_lod() We don't expect the output of a TXS instruction to be wider than a vec3. Add an assert() to make sure this never happens. Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-20 09:15:53 -07:00
Caio Marcelo de Oliveira Filho	12131096fa	spirv: Restrict use of descriptor intrinsics to Vulkan In ARB_gl_spirv we'll be able to use variables for uniform buffers, so don't use the descriptor intrinsics to lower the block access. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-19 22:07:51 -07:00
Jason Ekstrand	81e51b412e	nir: Make nir_constant a vector rather than a matrix Most places in NIR, we treat matrices like arrays. The one annoying exception to this has been nir_constant where a matrix is a first-class thing. This commit changes that so a matrix nir_constant is the same as an array nir_constant. This makes matrix nir_constants a tiny bit more expensive but shrinks all others by 96B. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-06-19 21:05:54 +00:00
Jason Ekstrand	b019fe8a5b	glsl/nir: Fix handling of 64-bit values in uniform storage Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-06-19 21:05:54 +00:00
Jason Ekstrand	a54e397152	spirv: Only copy needed components for OpSpecConstantOp Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-06-19 21:05:54 +00:00
Jason Ekstrand	96bb9c9277	spirv: Use a single path for OpSpecConstantOp of OpVectorShuffle Now that nir_const_value is a scalar, there's no reason why we need multiple paths here and it's just extra paths to keep working. While we're here, we also add a vtn_fail_if check that component indices are in-bounds. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-06-19 21:05:54 +00:00
Jason Ekstrand	280e5442e5	spirv: Use vtn_constant_uint() for array lengths and gather components Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-06-19 21:05:54 +00:00
Jason Ekstrand	aa11c2e75e	spirv: Add a vtn_constant_int helper Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-06-19 21:05:54 +00:00
Jason Ekstrand	93f4aa9889	glsl/types: Add a real is_integer helper Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-06-19 20:28:52 +00:00
Jason Ekstrand	f0920e266c	glsl/types: Rename is_integer to is_integer_32 It only accepts 32-bit integers so it should have a more descriptive name. This patch should not be a functional change. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-06-19 20:28:52 +00:00
Jason Ekstrand	21a7e6d569	glsl/types: Ignore bit sizes in contains_integer() All of the callers for this function are looking at interpolation qualifiers and want to make sure they're declared flat. Any 64-bit integer inputs need to be flat. It also makes the function make more sense since "integer" is fairly generic. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-06-19 20:28:52 +00:00
Jason Ekstrand	0d1fb380b1	glsl/types: Handle all bit sizes in glsl_type_is_integer All of the callers of this function really just want to know if the type is an integer and don't care about bit size. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-06-19 20:28:52 +00:00
Caio Marcelo de Oliveira Filho	feb0cdcb52	glsl/nir_opt_access: Update uniforms correctly when only vars change Even if only variables access flags are changed, the existing NIR infrastructure expects metadata to be explicitly preserved, so do that. Don't care about avoiding preserve to be called twice since the cost is negligible. This scenario can be triggered by dead variables, and also by other intrinsics that read the variables -- but not cause progress to be made when processing the intrinsics. Fixes: `f2d0e48ddc` "glsl/nir: Add optimization pass for access flags" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-19 12:50:41 -07:00
Caio Marcelo de Oliveira Filho	d7ea433a5f	glsl/nir: Fix getting the sampler dim when arrays are involved Unwrap any array in the variable type so we can get the sampler dim. This fixes piglit test spec/arb_arrays_of_arrays/execution/image_store/basic-imageStore-const-uniform-index.shader_test. Fixes: `f2d0e48ddc` "glsl/nir: Add optimization pass for access flags" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-19 12:50:39 -07:00
Connor Abbott	77be5b2f88	nir: Use reorderable access flag No changes with radeonsi shader-db. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-19 14:08:28 +02:00
Connor Abbott	a1c737927c	nir: Add a helper to determine if an intrinsic can be reordered This is simple now, but we're going to be adding a few more conditions to this later. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-19 14:08:28 +02:00
Connor Abbott	f2d0e48ddc	glsl/nir: Add optimization pass for access flags Right now, this just deduces when we can arbitrarily reorder SSBO and image loads, matching the existing logic in radeonsi's TGSI->LLVM pass. This approach can't handle some things that nir_opt_copy_prop_vars can, but it can handle images, and with GCM it lets us hoist reads outside of loops. We can also pass this information to LLVM which lets it do its own optimizations on it. This is GLSL only as I haven't tested it on Vulkan yet, and it would probably need a few changes to work there. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-19 14:08:28 +02:00
Connor Abbott	c813c5776d	nir: Add reorderable memory access enum Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-19 14:08:28 +02:00
Connor Abbott	75063fbac5	nir/copy_prop_vars: Ignore volatile accesses The spec explicitly says that volatile writes can't be removed and volatile reads do not guarantee that the same value will still be around after the read, as if there were a barrier after each read/write. Just ignore them. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-19 14:08:28 +02:00
Connor Abbott	364996d70d	glsl/nir: Propagate access qualifiers We were completely ignoring these before, except for putting them on variables. While we're here, don't set access qualifiers when converting to bindless since glsl_to_nir will already have set a more accurate qualifier that includes any qualifiers on struct members that are dereferenced. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-19 14:08:27 +02:00
Connor Abbott	6f20643b47	nir: Allow qualifiers on copy_deref and image instructions In the next commit, we'll properly handle access qualifiers on struct members by propagating them to load/store instructions, but these instructions had no way to specify the qualifier. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-19 14:08:27 +02:00
Connor Abbott	47e7c6961a	nir: add a vectorization pass This effectively does the opposite of nir_lower_alus_to_scalar, trying to combine per-component ALU operations with the same sources but different swizzles into one larger ALU operation. It uses a similar model as CSE, where we do a depth-first approach and keep around a hash set of instructions to be combined, but there are a few major differences: 1. For now, we only support entirely per-component ALU operations. 2. Since it's not always guaranteed that we'll be able to combine equivalent instructions, we keep a stack of equivalent instructions around, trying to combine new instructions with instructions on the stack. The pass isn't comprehensive by far; it can't handle operations where some of the sources are per-component and others aren't, and it can't handle phi nodes. But it should handle the more common cases, and it should be reasonably efficient. [Alyssa: Rebase on latest master, updating with respect to typeless moves] Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-18 06:43:30 -07:00
Boris Brezillon	296c5fd25d	nir/lower_tex: Add a way to lower TXS(non-0-LOD) instructions The V3D driver has an open-coded solution for this, and we need the same thing for Panfrost, so let's add a generic way to lower TXS(LOD) into max(TXS(0) >> LOD, 1). Changes in v2: * Use == 0 instead of ! * Rework the minification logic as suggested by Jason * Assign cursor pos at the beginning of the function * Patch the LOD just after retrieving the old value Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-18 06:36:07 -07:00
Boris Brezillon	0e489fd360	nir/lower_tex: Update ->sampler_dim value before calling get_texture_size() get_texture_size() will create a txs instruction with ->sampler_dim set to the original tex->sampler_dim. The condition to call lower_rect() only checks the value of ->sampler_dim and whether lower_rect is requested or not. This leads to an infinite loop when calling nir_lower_tex() with the same options until it returns false. In order to avoid that, let's move the tex->sampler_dim patching before get_texture_size() is called. This way the txs instruction will have ->sampler_dim set to GLSL_SAMPLER_DIM_2D and nir_lower_tex() won't try to lower it on the subsequent passes. Changes in v2: * Add Jason R-b * Add a comment explaining why we patch ->sampler_dim at the beginning of the lower_rect() func Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-18 06:36:07 -07:00
Boris Brezillon	352b1d9c31	nir/lower_tex: Actually report when projector lowering happened The code considers that projector lowering was done even if it's not really the case. Change the project_src() prototype to return a bool encoding whether projector lowering happened or not and update the progress var accordingly in nir_lower_tex_block(). --- Changes in v2: * Add Jason R-b * Drop the part suggesting that nir_lower_rect() could be called in a do-while(progress) loop. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-18 06:36:07 -07:00
Kenneth Graunke	3c10a2726b	glsl: Fix out of bounds read in shader_cache_read_program_metadata The VaryingNames array has NumVaryings entries. But BufferStride is a small array of MAX_FEEDBACK_BUFFERS (4) entries. Programs with more than 4 varyings would read out of bounds. Also, BufferStride is set based on the shader itself, which means that it's inherently already included in the hash, and doesn't need to be included again. At the point when shader_cache_read_program_metadata is called, the linker hasn't even set those fields yet. So, just drop it entirely. Fixes valgrind errors in KHR-GL45.transform_feedback.linking_errors_test. Fixes: `6d830940f7` glsl/shader_cache: Allow shader cache usage with transform feedback Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-17 21:22:19 -05:00
Neil Roberts	34d4b3e367	glsl: Set default precision on record members Record types have their own slot to store the precision for each member in glsl_struct_field. Previously if the member didn’t have an explicit precision qualifier this was being left as GLSL_PRECISION_NONE. This patch makes it take into account the type’s default precision qualifier like it does for regular variables in apply_type_qualifier_to_variable. This has the additional benefit of correctly reporting an error when a float type is used in a struct without declaring the default type. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-14 09:29:53 +02:00
Neil Roberts	235425771c	glsl/linker: Make precision matching optional in intrastage_match This function is confusingly also used to match interstage interfaces as well as intrastage. In the interstage case it needs to avoid comparing the precisions. This patch adds a parameter to specify whether to take the precision into account or not so that it can be used for both cases. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-14 09:29:53 +02:00
Neil Roberts	19b27a8569	glsl/linker: Don’t check precision for shader interface On GLES, the interface between vertex and fragment shaders doesn’t need to have matching precision. Section 4.3.10 of the GLSL ES 3.00 spec: “The type of vertex outputs and fragment inputs with the same name must match, otherwise the link command will fail. The precision does not need to match.” Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-14 09:29:53 +02:00
Neil Roberts	230d1e8d86	compiler/types: Make comparing record precision optional On GLES, the interface between vertex and fragment shaders doesn’t need to have matching precision. This adds an extra argument to glsl_types::record_compare to disable the precision comparison. This will later be used for the shader interface check. In order to make this work this patch also adds a helper function to recursively compare types while ignoring the precision. v2: Call record_compare from within compare_no_precision to avoid duplicating code (Eric Anholt). Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-14 09:29:53 +02:00
Iago Toral Quiroga	2a2501247b	nir: detect more dynamically uniform expressions Shader-db results for v3d: total instructions in shared programs: 9132728 -> 9119238 (-0.15%) instructions in affected programs: 596886 -> 583396 (-2.26%) helped: 1118 HURT: 224 total threads in shared programs: 234298 -> 234308 (<.01%) threads in affected programs: 10 -> 20 (100.00%) helped: 5 HURT: 0 total uniforms in shared programs: 3022949 -> 3022622 (-0.01%) uniforms in affected programs: 29163 -> 28836 (-1.12%) helped: 108 HURT: 37 total max-temps in shared programs: 1328030 -> 1327762 (-0.02%) max-temps in affected programs: 10097 -> 9829 (-2.65%) helped: 263 HURT: 15 total spills in shared programs: 3793 -> 3777 (-0.42%) spills in affected programs: 432 -> 416 (-3.70%) helped: 16 HURT: 0 total fills in shared programs: 4380 -> 4266 (-2.60%) fills in affected programs: 828 -> 714 (-13.77%) helped: 16 HURT: 0 Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-14 08:00:52 +02:00
Connor Abbott	37b92b0ae6	nir: Don't manually index intrinsic index enum This fixes a rebase fail in `ea51275e07`, and prevents it from happening again. There's no reason to do this manually. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-13 17:10:41 +02:00
Daniel Schürmann	7a858f274c	spirv/nir: add support for AMD_shader_ballot and Groups capability This commit also renames existing AMD capabilities: - gcn_shader -> amd_gcn_shader - trinary_minmax -> amd_trinary_minmax Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-13 12:44:23 +00:00
Daniel Schürmann	ea51275e07	nir: add intrinsics for AMD_shader_ballot Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-13 12:44:23 +00:00
Daniel Schürmann	1b89ebeede	nir/spirv: add support for the SubgroupBallotKHR SPIR-V capability This capability is required for the VK_EXT_shader_subgroup_ballot extension. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-13 12:44:23 +00:00
Daniel Schürmann	de56ebadce	nir/spirv: add support for the SubgroupVoteKHR SPIR-V capability This capability is required for the VK_EXT_shader_subgroup_vote extension. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-13 12:44:23 +00:00
Caio Marcelo de Oliveira Filho	2cb5907508	glsl: Check order and uniqueness of interlock functions With this commit all remaining compilation tests in Piglit for ARB_fragment_shader_interlock will pass. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2019-06-10 14:29:32 -07:00
Caio Marcelo de Oliveira Filho	b7c9fc72fd	glsl: Make interlock builtins follow same compiler rules as barriers Generalize the barrier code to provide correct error messages for other builtins. Fixes most of piglit compilation tests for ARB_fragment_shader_interlock. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2019-06-10 14:29:26 -07:00
Eduardo Lima Mitev	fb2169040a	nir/opt_algebraic: Fix rules for imadsh_mix16 The rules added in patch `3addd7c` are inverted: It should be: (al * bh) << 16 + c instead of: (ah * bl) << 16 + c Fixes a number of regressions under dEQP-GLES31.functional.draw_indirect.compute_interop.large.* on Freedreno. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-10 22:27:46 +02:00
Eric Engestrom	440fe0eb43	nir: fix s/&&/||/ typo Fixes: `cd73b6174b` "nir/lower_to_source_mods: Stop turning add, sat, and neg into mov" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-07 16:06:25 +01:00
Eduardo Lima Mitev	3addd7c8d9	nir_algebraic: Add basic optimizations for umul_low and imadsh_mix16 For umul_low (al * bl), zero is returned if the low 16-bit word of either source is zero. For imadsh_mix16 (ah * bl << 16 + c), c is returned if either 'ah' or 'bl' is zero. A couple of nir_search_helpers are added: is_upper_half_zero() returns true if the highest word of all components of an integer NIR alu src is zero. is_lower_half_zero() returns true if the lowest word of all components of an integer NIR alu src is zero. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 08:45:05 +02:00
Eduardo Lima Mitev	c27b3758fa	nir/opcodes: Add new 'umul_low' and 'imadsh_mix16' opcodes 'umul_low' is the low 32-bits of unsigned integer multiply. It maps directly to ir3's MULL_U. 'imadsh_mix16' is multiply add with shift and mix, an ir3 specific instruction that maps directly to ir3's IMADSH_M16. Both are necessary for the lowering of integer multiplication on Freedreno, which will be introduced later in this series. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 08:45:05 +02:00
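
The bitfield_select definition quoted in a8b0b6e52b can be modelled in a few lines of Python. This is only an illustrative sketch, not Mesa code: the 32-bit operand width (MASK32) and the use of a plain Python function are assumptions made for the example.

```python
# Illustrative model of the bitfield_select formula from commit a8b0b6e52b.
# Assumption: operands are 32-bit unsigned values.
MASK32 = 0xFFFFFFFF


def bitfield_select(mask: int, base: int, insert: int) -> int:
    """Take bits from `base` where `mask` is 1 and from `insert` where it is 0."""
    return ((mask & base) | (~mask & insert)) & MASK32


if __name__ == "__main__":
    # Upper half comes from base, lower half from insert.
    print(hex(bitfield_select(0xFFFF0000, 0x12345678, 0xABCDEF01)))
```

Every result bit comes from exactly one of the two sources, so an all-ones mask returns base unchanged and an all-zeros mask returns insert unchanged.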
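
Commit a74f256c58 relies on the fact that and(a, b) <= umin(a, b), which makes a comparison such as ('ult', a, ('and', b, a)) always false. The sketch below checks that property exhaustively for a small bit width. The helper names and the 8-bit default are assumptions chosen to keep the check fast; this is not the NIR pattern-matching code itself.

```python
# Exhaustive check of and(a, b) <= umin(a, b) for small unsigned integers.
# Assumption: an 8-bit range is representative; the argument is bitwise,
# so the same reasoning applies at any width.


def ult(a: int, b: int) -> bool:
    """Unsigned less-than on non-negative Python ints."""
    return a < b


def check_and_le_umin(bits: int = 8) -> bool:
    limit = 1 << bits
    for a in range(limit):
        for b in range(limit):
            # a & b can only clear bits, so it never exceeds either operand.
            if (a & b) > min(a, b):
                return False
            # Hence "a < (b & a)" can be folded to False.
            if ult(a, b & a):
                return False
    return True


if __name__ == "__main__":
    print(check_and_le_umin())
```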
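
Commit 296c5fd25d describes lowering TXS(LOD) into max(TXS(0) >> LOD, 1). A rough Python model of that arithmetic follows. The function name and the tuple representation of a texture size are hypothetical, and the sketch minifies every component as the formula in the commit message states; it does not model how the actual pass treats individual components.

```python
# Model of the size computation max(TXS(0) >> LOD, 1) from commit 296c5fd25d.
# Assumptions: `base_size` stands in for the result of TXS at LOD 0, and
# every component is minified the same way.
from typing import Tuple


def lowered_txs(base_size: Tuple[int, ...], lod: int) -> Tuple[int, ...]:
    """Size of mip level `lod`, derived from the level-0 size."""
    return tuple(max(dim >> lod, 1) for dim in base_size)


if __name__ == "__main__":
    print(lowered_txs((256, 64), 3))
    print(lowered_txs((256, 64), 7))
```

The max(..., 1) clamp matters once the shift exceeds the log2 of a dimension: without it, a 64-texel dimension at LOD 7 would report a size of 0 instead of 1.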

3820 commits