fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-24 06:18:10 +02:00

Author	SHA1	Message	Date
Marek Olšák	3aa72a394a	nir/serialize: store 32-bit object IDs instead of 64-bit That means we have only 30 bits for object IDs, because 2 bits are sometimes used for something else. This decreases the uncompressed shader size for the biggest Borderlands 2 shader from 33.6 KB to 23.2 KB. (31% decrease) Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-05 23:35:31 -05:00
Marek Olšák	d5768fcd45	nir/serialize: don't expand 16-bit variable state slots to 32 bits the swizzle also needs only 16 bits Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-05 23:35:31 -05:00
Marek Olšák	96e6ef80d9	nir: pack the rest of nir_variable::data Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-05 23:32:34 -05:00
Marek Olšák	4319cc8c0f	nir: pack nir_variable::data::xfb_* Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-04 18:17:34 -05:00
Marek Olšák	08dc541b66	nir: pack nir_variable::data::stream Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-04 18:17:34 -05:00
Ian Romanick	9be4a422a0	nir/algebraic: Mark other comparison exact when removing a == a This prevents some additional optimizations that would change the original result. This includes things like (b < a && b < c) => b < min(a, c) and !(a < b) => a >= b. Both of these optimizations were specifically observed in the piglit tests added in piglit!160. This was discovered while investigating https://gitlab.freedesktop.org/mesa/mesa/issues/1958. However, the problem in that issue was that Chrome or Angle is replacing calls to isnan() with some stuff that we (correctly) optimize to false. If they had left the calls to isnan() alone, everything would have just worked. No shader-db changes on any Intel platform. I also tried marking the comparison generated by the isnan() function precise. The precise marker "infects" every computation involved in calculating the parameter to the isnan() function, and this severely hurt all of the (few) shaders in shader-db that use isnan(). I also considered adding a new ir_unop_isnan opcode that would implement the functionality. During GLSL IR-to-NIR translation, the resulting comparison operation would be marked exact (and the same thing would need to happen in SPIR-V translation). The approach taken by this patch seemed easier, but we may want to do the ir_unop_isnan thing anyway. Fixes: `d55835b8bd` ("nir/algebraic: Add optimizations for "a == a && a CMP b"") Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-04 14:05:49 -08:00
Ian Romanick	ea19f2fb68	nir/algebraic: Add the ability to mark a replacement as exact Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-04 14:05:49 -08:00
Marek Olšák	af94600484	compiler: make variable::data::binding unsigned Nothing seems to set a negative value. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-04 16:49:46 -05:00
Dylan Baker	717606f9f3	nir: correct use of identity check in python Python has the identity operator `is`, and the equality operator `==`. Using `is` with strings sometimes works in CPython due to optimizations (they have some kind of cache), but it may not always work. Fixes: `96c4b135e3` ("nir/algebraic: Don't put quotes around floating point literals") Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-11-04 16:06:39 +00:00
Tapani Pälli	b380d47998	nir: fix couple of compile warnings Fixes "warning: braces around scalar initializer" warnings. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 00:21:44 +00:00
Timothy Arceri	7f106a2b5d	util: rename list_empty() to list_is_empty() This makes it clear that it's a boolean test and not an action (eg. "empty the list"). Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-10-28 11:24:38 +00:00
Dylan Baker	09ee11f5da	nir: Fix invalid code for MSVC Fixes: `ee2050b111` ("nir: Use BITSET for tracking varyings in lower_io_arrays")	2019-10-25 22:47:32 +00:00
Kenneth Graunke	f306d07932	nir: Use VARYING_SLOT_TESS_MAX to size indirect bitmasks MAX_VARYINGS_INCL_PATCH subtracts VARYING_SLOT_VAR0 giving us a size that's too small, so BITSET_SET writes words out of bounds, corrupting the stack and causing all kinds of chaos. VARYING_SLOT_TESS_MAX is the right value to use here, as it's the largest location. Closes: 2002 Fixes: `ee2050b111` ("nir: Use BITSET for tracking varyings in lower_io_arrays") Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-25 13:29:09 -07:00
Kristian H. Kristensen	ee2050b111	nir: Use BITSET for tracking varyings in lower_io_arrays MAX_VARYINGS_INCL_PATCH is greater than 64, so we'll need more than 64 bits (per component) to track which vars have indirects. This pass was trying to track patch varyings (which start at bit 63) in a separate 64 bit word, but failed to subtract VARYING_SLOT_PATCH0 and accessed out of bounds. Do away with the ad-hoc bit mask tracking and just use a BITSET. Fixes: dEQP-GLES31.functional.tessellation.user_defined_io.per_patch_block.vertex_io_array_size_implicit.triangles Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-24 15:32:20 -07:00
Caio Marcelo de Oliveira Filho	901071044e	nir/tests: Add copy propagation tests with scoped_memory_barrier Three groups of tests, effectively defining what cases the optimization is allowed or prevented - Redundant loads (a load generated the value) - Propagate SSA values (a store generated the value) - Propagate a var (a copy generated the value) Change the shader type of the tests to be COMPUTE so nir_var_mem_shared can also be used. Doesn't affect the semantics of the copy propagation. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho	73572abc2a	nir: Add scoped_memory_barrier intrinsic Add a NIR intrinsic that represents a memory barrier in SPIR-V / Vulkan Memory Model, with extra attributes that describe the barrier: - Ordering: whether it is an Acquire or Release; - "Cache control": availability ("ensure this gets written in the memory") and visibility ("ensure my cache is up to date when I'm reading"); - Variable modes: which memory types this barrier applies to; - Scope: how far this barrier applies. Note that unlike in SPIR-V, the "Storage Semantics" and the "Memory Semantics" are split into two different attributes so we can use variable modes for the former. NIR passes that took barriers into consideration were also changed - nir_opt_copy_prop_vars: clean up the values for the mode of an ACQUIRE barrier. Copy propagation effect is to "pull up a load" (by not performing it), which is what ACQUIRE restricts. - nir_opt_dead_write_vars and nir_opt_combine_writes: clean up the pending writes for the modes of a RELEASE barrier. Dead writes effect is to "push down a store", which is what RELEASE restricts. - nir_opt_access: treat the ACQUIRE and RELEASE as a full barrier for the modes. This is conservative, but since this is a GL-specific pass, doesn't make a difference for now. v2: Fix the scoped barrier handling in copy propagation. (Jason) Add scoped barrier handling to nir_opt_access and nir_opt_combine_writes. (Rhys) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-24 11:39:55 -07:00
Timothy Arceri	922801b77d	nir: improve nir_variable packing Before: /* size: 136, cachelines: 3, members: 10 */ After: /* size: 128, cachelines: 2, members: 10 */ Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-10-24 13:24:40 +11:00
Timothy Arceri	c412ff426b	nir: fix nir_variable_data packing Before: /* size: 60, cachelines: 1, members: 29 */ After: /* size: 56, cachelines: 1, members: 29 */ Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-10-24 13:22:59 +11:00
Marek Olšák	28199aeee5	st/mesa: assign driver locations for VS inputs for NIR before caching fix up edge flags in the NIR pass, because st/mesa doesn't touch the inputs after caching Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-23 21:12:52 -04:00
Erik Faye-Lund	acf1bf47cc	Revert "nir: drop support for using load_alpha_ref_float" This reverts commit `5af272b474`. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jose Maria Casanova <jmcasanova@igalia.com>	2019-10-23 13:03:52 +02:00
Erik Faye-Lund	beb6639a9d	Revert "nir: drop unused alpha_ref_float" This reverts commit `e8095f2af0`. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jose Maria Casanova <jmcasanova@igalia.com>	2019-10-23 13:03:38 +02:00
Marek Olšák	a0b711d8e9	nir: allow nir_lower_uniforms_to_ubo to be run repeatedly for st/mesa Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-22 14:41:23 -04:00
Rhys Perry	8b98d0954e	nir/lower_idiv: add new llvm-based path v2: make variable names snake_case v2: minor cleanups in emit_udiv() v2: fix Panfrost build failure v3: use an enum instead of a boolean flag in nir_lower_idiv()'s signature v4: remove nir_op_urcp v5: drop nv50 path v5: rebase v6: add back nv50 path v6: add comment for nir_lower_idiv_path enum v7: rename _nv50/_llvm to _fast/_precise v8: fix etnaviv build failure Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-21 18:49:46 +00:00
Rob Clark	5e08f070f0	nir: add nir_lower_amul pass Lower amul to either imul or imul24, depending on whether 24b is enough bits to calculate an offset within the thing being dereferenced. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-10-18 15:08:54 -07:00
Rob Clark	1bdde31392	nir: add address calc related opt rules Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Rob Clark	6320e37d4b	nir: add amul instruction Used for address/offset calculation (ie. array derefs), where we can potentially use less than 32b for the multiply of array idx by element size. For backends that support `imul24`, this gives a lowering pass an easy way to find multiplies that potentially can be converted to `imul24`. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Rob Clark	0568761f8e	nir: Add a new ALU nir_op_imul24 Some hardware can do 24b multiply in a single instruction, but not 32b. However in most cases 24b is sufficient for address/offset calculation. Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Eduardo Lima Mitev	32e5fbf47c	nir: Add a new ALU nir_op_imad24_ir3 ir3 compiler has a signed integer multiply-add instruction (MAD_S24) that is used for different offset calculations in the backend. Since we intend to move some of these calculations to NIR, we need a new ALU op that can directly represent it. Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Rob Clark	ad8167c1e0	nir/search: fix the PoT helpers Otherwise, if the base type is (for example) uint32, we would incorrectly think that PoT optimizations could not apply. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Jason Ekstsrand <jason@jleksrand.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Eduardo Lima Mitev	f1d4fadf1b	nir: Add new texop nir_texop_tex_prefetch This is like nir_texop_tex, but signals that the sampling coordinates are immutable during the shader stage, in a way that allows the HW that supports pre-dispatching sampling operations to pre-fetch the result prior to scheduling the shader stage. This is introduced to support the feature in Freedreno. Adreno HW from a4xx supports it. A NIR pass introduced later in this series will detect sampling operations that are eligible for pre-dispatch, and replace nir_texop_tex by this new op, to tell the backend to enable pre-fetch. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Ian Romanick	050e4e28bf	nir/search: Fix possible NULL dereference in is_fsign Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Fixes: `09705747d7` ("nir/algebraic: Reassociate fadd into fmul in DPH-like pattern")	2019-10-17 15:07:01 -07:00
Kristian H. Kristensen	8e16fb1528	freedreno/ir3: Implement lowering passes for VS and GS This introduces two new lowering passes. One to lower VS to explicit outputs using STLW and one to lower GS to load input using LDLW and implement the GS specific functionality. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	0324706764	freedreno/ir3: Add intrinsics that map to LDLW/STLW These intrinsics will let us do all the offset calculations in nir, which is nicer to work with and lets nir_opt_algebraic eat it all up. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Erik Faye-Lund	e8095f2af0	nir: drop unused alpha_ref_float Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Erik Faye-Lund	5af272b474	nir: drop support for using load_alpha_ref_float Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Erik Faye-Lund	71c0dcf266	nir: support feeding state to nir_lower_clip_[vg]s Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Erik Faye-Lund	eb3047c094	nir: support lowering clipdist to arrays This allows us to make sure clipdist is emitted as a scalar array rather than two vec4s. This matches SPIR-V semantics, and will be useful for Zink. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Erik Faye-Lund	011d692a52	nir: support derefs in two-sided lighting lowering Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Erik Faye-Lund	878c94288a	nir: add lowering-pass for point-size mov Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Erik Faye-Lund	6d7e02e37d	nir: allow passing alpha-ref state to lowering-code Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Dave Airlie	dc91a02a72	nir: add a pass to lower flat shading. This takes any color or backcolor that has unspecified shading and converts it to flat shading. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Neil Roberts	0832845dc6	nir/builtin: Add extern "C" guards to nir_builtin_builder.h That way it can also be included from a C++ source. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-12 09:43:18 +02:00
Neil Roberts	9eaeedd54b	nir/builtin: Add #include u_math.h to the header The inline functions use M_PI so they should include a header to make sure it is defined. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-12 09:43:18 +02:00
Neil Roberts	2098ae16c8	nir/builder: Move nir_atan and nir_atan2 from SPIR-V translator Moves build_atan and build_atan2 into nir_builtin_builder. The goal is to be able to use this from the GLSL translator too. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-12 09:43:17 +02:00
Bas Nieuwenhuizen	6da3bf2600	nir/dead_cf: Remove dead control flow after infinite loops. And after discard-only loops. Otherwise we end up with dead code which confuses nir_repair_ssa into adding a whole bunch of uses of undefined. However, for derefs, we sometimes always expect to get a variable instead of undefined. Fixes dEQP-VK.graphicsfuzz.write-red-in-loop-nest on radv. Fixes: `c832820ce9` "nir/dead_cf: Repair SSA if the pass makes progress" Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1928 Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-10-11 17:24:26 +02:00
Rhys Perry	599d634c2c	nir/lower_input_attachments: pass on non-uniform access flag Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-11 14:26:58 +00:00
Rhys Perry	5ef04d7982	nir/lower_non_uniform: lower image/texture instructions taking derefs v2: always assert on the texture/sampler handle's num_components v3: replicate the deref inside the loop v4: remove a case of useless line wrapping Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-11 14:26:58 +00:00
Marek Olšák	cebc38ff60	nir: add nir_shader_compiler_options::lower_to_scalar This will replace PIPE_SHADER_CAP_SCALAR_ISA. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-10 15:49:18 -04:00
Marek Olšák	e5209e6a95	nir/drawpixels: fix what appears to be a copy-paste bug in get_texcoord_const Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-10 15:49:18 -04:00
Marek Olšák	e621b30787	nir/drawpixels: handle load_color0, load_input, load_interpolated_input for radeonsi Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-10 15:49:18 -04:00
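
Commit 717606f9f3 above replaces an identity check with an equality check in Python. A minimal sketch of the difference follows; the helper name `is_zero_literal` and the literal "0.0" are hypothetical and are not taken from the Mesa sources.

```python
def is_zero_literal(value):
    """Hypothetical helper: True when value spells the text "0.0"."""
    # `==` compares string contents, so it is the reliable check.
    return value == "0.0"


# Build an equal string at run time instead of reusing a source constant.
literal = "0.0"
built = "".join(["0", ".", "0"])

# Equality holds because the contents match.
contents_match = is_zero_literal(built)

# Identity asks whether both names refer to the same object. The answer
# depends on interpreter caching, so code should never rely on it.
same_object = built is literal

print("equal contents:", contents_match)
print("same object (implementation detail):", same_object)
```

Only the first result is dependable; the second may differ between interpreters and versions, which is why the commit switches to `==`.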
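
Commit 9be4a422a0 above concerns comparison rewrites that are only safe when no operand is NaN. The sketch below uses plain Python floats, which follow IEEE 754 comparison rules, to show why `!(a < b)` and `a >= b` diverge for NaN and why `a == a` works as a NaN test. The function names are illustrative and do not appear in NIR.

```python
def not_less_than(a, b):
    # The original form: !(a < b)
    return not (a < b)


def greater_equal(a, b):
    # The rewritten form: a >= b
    return a >= b


def is_nan_by_self_compare(a):
    # NaN is the only float that does not compare equal to itself.
    return not (a == a)


nan = float("nan")

# For ordinary numbers the two forms agree.
print(not_less_than(1.0, 2.0), greater_equal(1.0, 2.0))

# With NaN every ordered comparison is false, so the forms disagree.
print(not_less_than(nan, 1.0), greater_equal(nan, 1.0))

print(is_nan_by_self_compare(nan), is_nan_by_self_compare(1.0))
```

An optimizer that treats the two forms as interchangeable changes results for NaN inputs, which is what marking the comparison exact is meant to prevent.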
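
Commits ee2050b111 and f306d07932 above deal with sizing a bitset so that the largest slot index still fits. The following sketch models the failure with a Python list of words. The constants `VAR0` and `TESS_MAX` and the 32-bit word size are made-up stand-ins, not the real Mesa values, and Python raises `IndexError` where C would silently write out of bounds.

```python
WORD_BITS = 32  # assumed word size for this sketch


def bitset_words(num_bits):
    """Number of words needed to hold num_bits bits (rounded up)."""
    return (num_bits + WORD_BITS - 1) // WORD_BITS


def bitset_set(words, bit):
    """Set one bit; raises IndexError if the bitset is too small."""
    words[bit // WORD_BITS] |= 1 << (bit % WORD_BITS)


# Hypothetical slot layout: indices run from 0 to TESS_MAX - 1.
VAR0 = 32
TESS_MAX = 96

# Sizing by (TESS_MAX - VAR0) drops the slots below VAR0 from the count,
# yet the pass still indexes by absolute slot number.
too_small = [0] * bitset_words(TESS_MAX - VAR0)
big_enough = [0] * bitset_words(TESS_MAX)

largest_slot = TESS_MAX - 1
bitset_set(big_enough, largest_slot)

try:
    bitset_set(too_small, largest_slot)
    overflowed = False
except IndexError:
    overflowed = True

print("undersized bitset overflowed:", overflowed)
```

Sizing by the largest location, as the fix does, keeps every absolute slot index in range.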
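
Commits 0568761f8e and 6320e37d4b above introduce a 24-bit multiply for address and offset calculation. The sketch below models one plausible reading of such an operation: each operand is truncated to its low 24 bits and sign-extended, and the product wraps to 32 bits. That exact behaviour is an assumption made for illustration and is not confirmed by the commit messages; `imul24` and `imul32` here are plain Python functions, not NIR opcodes.

```python
MASK32 = 0xFFFFFFFF


def sign_extend24(x):
    """Keep the low 24 bits of x and interpret them as a signed value."""
    x &= 0xFFFFFF
    return x - 0x1000000 if x & 0x800000 else x


def imul24(a, b):
    # Assumed semantics: multiply the sign-extended low 24 bits of each
    # operand and wrap the result to 32 bits.
    return (sign_extend24(a) * sign_extend24(b)) & MASK32


def imul32(a, b):
    # Full 32-bit multiply with wraparound, for comparison.
    return (a * b) & MASK32


# A typical offset calculation: array index times element size.
index, element_size = 10, 16
print(imul24(index, element_size), imul32(index, element_size))

# Bits above bit 23 are ignored by the 24-bit form only.
print(hex(imul24(0x1000003, 4)), hex(imul32(0x1000003, 4)))
```

When both operands fit in 24 bits, as is usual for offsets, the two forms agree, which is the case the commits describe as sufficient.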

2230 commits