fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-17 18:18:06 +02:00

Author	SHA1	Message	Date
Jason Ekstrand	9e6a6ef0d4	nir/deref: Rematerialize parents in rematerialize_derefs_in_use_blocks When nir_rematerialize_derefs_in_use_blocks_impl was first written, I attempted to optimize things a bit by not bothering to re-materialize the sources of deref instructions figuring that the final caller would take care of that. However, in the case of more complex deref chains where the first link or two lives in block A and then another link and the load/store_deref intrinsic live in block B it doesn't work. The code in rematerialize_deref_in_block looks at the tail of the chain, sees that it's already in block B and skips it, not realizing that part of the chain also lives in block A. The easy solution here is to just rematerialize deref sources of deref instructions as well. This may potentially lead to a few more deref instructions being created, but the conditions required for that to actually happen are fairly unlikely and, thanks to the caching, it's all linear time regardless. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109603 Fixes: `7d1d1208c2` "nir: Add a small pass to rematerialize derefs per-block" Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-02-11 10:57:23 -06:00
Ian Romanick	b031c64349	nir: Convert a bcsel with only phi node sources to a phi node v2: Remove the original ALU instruction after all of its readers are modified to read the new ALU instruction. v3: Fix an issue where a bcsel that may not be executed on a loop iteration due to a break statement is converted to a phi (and therefore incorrectly "executed"). Noticed by Tim. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109216 Fixes: `8fb8ebfbb0` ("intel/compiler: More peephole select") Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-08 10:37:06 -08:00
Ian Romanick	0881e90c09	nir: Split ALU instructions in loops that read phis A single shader in Unigine Superposition is affected by this change. A single iadd is moved to the end of a loop. This iadd is involved in a complex set of logic to terminate the loop, and an extra mov instruction is inserted. This shader really needs the optimization suggested by bugzilla #94747, and I expect that to make this tiny regression go away. All Gen7+ platforms had similar results. (Skylake shown) total instructions in shared programs: 15047543 -> 15047545 (<.01%) instructions in affected programs: 565 -> 567 (0.35%) helped: 0 HURT: 2 total cycles in shared programs: 369977253 -> 369978253 (<.01%) cycles in affected programs: 127910 -> 128910 (0.78%) helped: 0 HURT: 2 v2: Skip nir_op_vec{2,3,4} and nir_op_[fi]mov instructions to avoid infinite optimization loops. Remove the original ALU instruction after all of its readers are modified to read the new ALU instruction. v3: Extend to the more general case. If the prev-block value from the phi is not undef, this means the ALU instruction has to be duplicated in both the prev-block and the continue-block. Fixes: `8fb8ebfbb0` ("intel/compiler: More peephole select") Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-08 10:37:06 -08:00
Ian Romanick	0c0c69729b	nir: Select phi nodes using prev_block instead of continue_block This simplifies some changes coming later. Fixes: `8fb8ebfbb0` ("intel/compiler: More peephole select") Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-08 10:37:06 -08:00
Ian Romanick	8d8f80af3a	nir: Refactor code that checks phi nodes in opt_peel_loop_initial_if This will be used in a couple more places soon. The function name is... horribly long. Neither Matt nor I could think of anything that was shorter and still more descriptive than "is_phi_foo". I'm willing to entertain suggestions. Fixes: `8fb8ebfbb0` ("intel/compiler: More peephole select") Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-08 10:37:06 -08:00
Ian Romanick	4d65d2b12e	nir: Document some fields of nir_loop_terminator Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-08 10:37:06 -08:00
Ian Romanick	78169870e4	nir: Silence zillions of unused parameter warnings in release builds Fixes: `cd56d79b59` "nir: check NIR_SKIP to skip passes by name" Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-08 10:37:06 -08:00
Timothy Arceri	26aa460940	nir: rewrite varying component packing There are a number of reasons for the rewrite. 1. Adding support for packing tess patch varyings in a sane way. 2. Making use of qsort allowing the code to be much easier to follow. 3. Fixes a bug where different interp types caused component packing to be skipped for all varyings in some scenarios. 4. Allows us to add a crude live range analysis for deciding which components should be packed together. This support can optionally be added in a future patch. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-08 02:54:56 +00:00
Timothy Arceri	2f53260417	nir: add is_packing_supported_for_type() helper This will be used in the following patches to determine if we support packing the components of a varying. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-08 02:54:56 +00:00
Timothy Arceri	7b01d5c354	nir: add support for marking used patches when packing varyings This adds support needed for marking the varyings as used but we don't actually support packing patches in this patch. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-08 02:54:56 +00:00
Kenneth Graunke	15c6902117	nir: Avoid splitting compact arrays into per-element variables. Compact arrays are used for special variables like clip and cull distances, or tessellation levels. Drivers using compact arrays assume that these values will always be actual arrays. We don't want to turn a float[1] gl_CullDistance into a single float; that would confuse drivers. Today, i965 uses compact arrays, and Gallium drivers use nir_lower_io_arrays_to_elements, so we haven't had any overlap that would demonstrate the issue. Iris will use both. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-05 13:58:46 -08:00
Kenneth Graunke	ba9dcc80fb	nir: Avoid clip/cull distance lowering multiple times. A couple places in st/nir assume that cull distances have been lowered away, so it will need to call this lowering pass for drivers which opt out of the GLSL IR lowering. The Intel backend also calls this pass, for i965 and anv. We need to only do it once. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-05 13:58:46 -08:00
Kenneth Graunke	5730364d69	nir: Bail on clip/cull distance lowering if GLSL IR already did it. We have a GLSL IR pass to convert clip/cull distance float[] arrays into vec4[2] arrays. In `ff281e6204`, we attempted to skip this pass if the GLSL IR lowering had already run. But, that code was not quite right, as we forgot to strip away the per-vertex IO array layer for geometry and tessellation shader varyings. If the GLSL IR pass has run, the variables will not be marked as "compact". So we can simply check that and bail. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-05 13:58:46 -08:00
Kenneth Graunke	3327c93510	nir: Record info->fs.pixel_center_integer in lower_system_values radeonsi uses a system value for gl_FragCoord rather than an input var. These get translated into load_frag_coord NIR intrinsics, which lose the pixel_center_integer and origin_upper_left decorations. To cope with this, Tim added a shader_info field for pixel_center_integer, and made glsl_to_nir set it accordingly. prog_to_nir also needs to handle these fragcoord conventions. Instead of duplicating the logic to set the info field, just move it to nir_lower_system_values so it'll happen regardless of who makes the NIR. (For what it's worth, we don't need an info flag for origin_upper_left, because radeonsi lowers origin conventions in nir_lower_wpos_ytransform before nir_lower_system_values destroys the variable and qualifiers.) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-05 13:51:52 -08:00
Jason Ekstrand	36734987a5	nir/deref: Drop zero ptr_as_array derefs They are effectively (&x)[0] or *&x which does nothing. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-05 15:17:19 -06:00
Jonathan Marek	4f0a3c9f9e	nir: add missing vec opcodes in lower_bool_to_float Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-05 15:34:15 +00:00
Caio Marcelo de Oliveira Filho	51547bbc5a	nir: keep the phi order when splitting blocks All things being equal, it is better to keep the original order. Since the new block is empty, push the phis in order to the tail. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>	2019-02-04 20:41:13 -08:00
Matt Turner	9de90caca8	nir: Optimize double-precision lower_round_even() Use the trick of adding and then subtracting 2**52 (52 is the number of explicit mantissa bits a double-precision floating-point value has) to implement round-to-even. Cuts the number of instructions on SKL of the piglit test fs-roundEven-double.shader_test from 109 to 21. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-01-29 15:02:23 -08:00
Jason Ekstrand	9e34781aef	nir: Allow SSBOs and global to alias Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-26 13:41:50 -06:00
Jason Ekstrand	9839ce8bf9	nir/validate: Allow array derefs of vectors for nir_var_mem_global Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-01-26 13:39:18 -06:00
Jason Ekstrand	5f5503d498	nir/lower_io: Add support for nir_var_mem_global Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-01-26 13:39:18 -06:00
Jason Ekstrand	314d2c90c3	nir/lower_io: Add a 32 and 64-bit global address formats These are simple scalar addresses. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-26 13:39:18 -06:00
Jason Ekstrand	e461926ef2	nir: Add load/store/atomic global intrinsics These correspond roughly to reading/writing OpenCL global pointers. The idea is that they just take a bare address and load/store from it. Of course, exactly what this address means is driver-dependent. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-01-26 13:39:18 -06:00
Jason Ekstrand	39925d60ec	anv: Add pipeline cache support for xfb_info Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-22 10:42:56 -06:00
Alejandro Piñeiro	6b50b0a4a8	nir/xfb: distinguish array of structs vs array of blocks Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-22 10:42:56 -06:00
Jason Ekstrand	ac704e777c	nir/xfb: Properly handle arrays of blocks Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-01-22 10:42:56 -06:00
Alejandro Piñeiro	5649a0a6e8	nir/xfb: don't assert when xfb_buffer/stride is present but not xfb_offset This is needed to allow nir_gather_xfb_info to be used on OpenGL, specifically ARB_gl_spirv. From the OpenGL 4.6 spec, section 11.1.2.1, "Output Variables": "outputs specifying both an XfbBuffer and an Offset are captured, while outputs not specifying both of these are not captured. Values are captured each time the shader writes to such a decorated object." This implies that outputs are captured if both are present, and not if either is lacking. Technically, it doesn't explicitly state that having just one or the other is a mistake. In some cases, glslang adds an extra XfbBuffer without an XfbOffset, and notes that technically this is not a bug (see issue#1526). For Vulkan, as the same glslang issue mentions, it is not clear whether that should be a mistake or not. But even if it is a mistake, it does not really need to be checked in the driver, and we can let the validation layers check that. v2: simplify explicit_xfb_buffer and explicit_offset checks (Jason). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-22 10:42:56 -06:00
Jason Ekstrand	4f99ac9144	nir/xfb: Fix offset accounting for dvec3/4 Before, we were double-counting the component slots when we had a dvec3 or dvec4. Instead, just add them in once and manually offset the recorded output offset. Fixes: `19064b8c` "nir: Add a pass for gathering transform feedback info" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-01-22 10:42:56 -06:00
Jason Ekstrand	96fa23bca5	nir: Preserve offsets in lower_io_to_scalar_early Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-01-22 10:42:56 -06:00
Samuel Pitoiset	b2bbd978d0	nir: fix lowering arrays to elements for XFB outputs If we have a transform feedback output like: float[2] x2_out (VARYING_SLOT_VAR1.x, 0, 0) which is lowered by nir_lower_io_arrays_to_elements to, float x2_out (VARYING_SLOT_VAR1.x, 0, 0) float x2_out@5 (VARYING_SLOT_VAR2.x, 0, 0) We have to update the destination offset to avoid overwriting the same value. v2 (Jason Ekstrand): - Compute the correct offsets for arrays of vectors and/or doubles Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-01-22 10:42:56 -06:00
Samuel Pitoiset	9f4e0aa7c1	nir: do not remove varyings used for transform feedback When an xfb buffer is explicitly declared on a varying variable, we shouldn't remove it at link time. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-22 10:42:56 -06:00
Jason Ekstrand	ca8c6c9781	nir: Mark deref UBO and SSBO access as non-scalar Fixes: `63b9aa2e25` "spirv: Add support for using derefs for..." Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-21 18:41:47 -06:00
Karol Herbst	8bb46de08b	mesa: add MESA_SHADER_KERNEL used for CL kernels Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-21 20:36:41 +01:00
Karol Herbst	0a793c78a3	nir: add bit_size parameter to system values with multiple allowed bit sizes v2: add assert to verify we have at least one valid bit_size v3: fix use of load_front_face in nir_lower_two_sided_color and tgsi_to_nir Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-21 00:17:18 +01:00
Karol Herbst	4125211e9c	nir: add legal bit_sizes to intrinsics With OpenCL some system values match the address bits, but in GLSL we also have some system values being 64 bit like subgroup masks. With this it is possible to adjust the builder functions so that depending on the bit_sizes the correct bit_size is used or an additional argument is added in case of multiple possible values. v2: validate dest bit_size v3: generate hex values in python code remove useless imports rename and move bit_sizes v4: add 1 to legal bit_sizes for front_face Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-21 00:16:51 +01:00
Karol Herbst	27bd07e230	nir/validate: allow to check against a bitmask of bit_sizes Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-21 00:16:51 +01:00
Karol Herbst	acdad24585	nir/spirv: handle SpvStorageClassCrossWorkgroup v2: rename nir_var_global to nir_var_mem_global Signed-off-by: Karol Herbst <kherbst@redhat.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-19 20:01:42 +01:00
Karol Herbst	36a76b7192	nir: rename nir_var_shared to nir_var_mem_shared Signed-off-by: Karol Herbst <kherbst@redhat.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-19 20:01:41 +01:00
Karol Herbst	6fefd69724	nir: rename nir_var_ssbo to nir_var_mem_ssbo Signed-off-by: Karol Herbst <kherbst@redhat.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-19 20:01:41 +01:00
Karol Herbst	3afc1e068f	nir: rename nir_var_ubo to nir_var_mem_ubo Signed-off-by: Karol Herbst <kherbst@redhat.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-19 20:01:41 +01:00
Karol Herbst	9b24028426	nir: rename nir_var_function to nir_var_function_temp Signed-off-by: Karol Herbst <kherbst@redhat.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-19 20:01:41 +01:00
Karol Herbst	e5daef9587	nir: rename nir_var_private to nir_var_shader_temp Signed-off-by: Karol Herbst <kherbst@redhat.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-19 20:01:41 +01:00
Caio Marcelo de Oliveira Filho	cd56d79b59	nir: check NIR_SKIP to skip passes by name Pass function names, separated by commas and listed in the NIR_SKIP environment variable, will be skipped in debug mode. The mechanism is hooked into the _PASS macro, like NIR_PRINT. The extra macro NIR_SKIP is available as a developer convenience, to skip at points other than the passes' entry points. v2: Fix typo in NIR_SKIP macro. (Bas) Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-18 12:31:49 -08:00
Bas Nieuwenhuizen	8424cd8fbd	nir: Account for atomics in copy propagation. Otherwise writes get propagated across atomics if no barrier is used. Without barrier writes should still be visible in the same invocation, so an atomic has to be considered a write. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Fixes: `b3c6146925` "nir: Copy propagation between blocks" Fixes: `62332d139c` "nir: Add a local variable-based copy propagation pass"	2019-01-18 00:55:35 +01:00
Jason Ekstrand	2d2737dcfe	nir: Add a bool to float32 lowering pass From @jekstrand's nir-1-bit-bool branch, with improved ior/inot lowering. ior: fmax instead of fadd allows removing the fsat. inot: seq(x, 0) can be better than fsub(1, x). On a2xx, it works better with the scalar instruction set. Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-01-14 19:27:06 +00:00
Caio Marcelo de Oliveira Filho	9fdded0cc3	src/compiler: use new hash table and set creation helpers Replace calls to create hash tables and sets that use _mesa_hash_pointer/_mesa_key_pointer_equal with the helpers _mesa_pointer_hash_table_create() and _mesa_pointer_set_create(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Eric Engestrom <eric@engestrom.ch>	2019-01-14 10:49:28 -08:00
Jason Ekstrand	821b6861ec	nir/gcm: Support deref instructions Even though no one's been brave enough to ever use this pass, I like to keep it functionally working. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-12 17:55:49 -06:00
Rhys Perry	0210243923	nir: fix copy-paste error in nir_lower_constant_initializers Fixes: `393b59e077` ('nir: Rework nir_lower_constant_initializers() to handle functions') Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-10 10:51:52 -06:00
Matt Turner	2623653126	nir: Unset metadata debug bit if no progress made NIR metadata validation verifies that the debug bit was unset (by a call to nir_metadata_preserve) if a NIR optimization pass made progress on the shader. With the expectation that the NIR shader consists of only a single main function, it has been safe to call nir_metadata_preserve() iff progress was made. However, most optimization passes calculate progress per-function and then return the union of those calculations. In the case that an optimization pass makes progress only on a subset of the functions in the shader, metadata validation will detect that the debug bit is still set on any unchanged functions, resulting in a failed assertion. This patch offers a quick solution (short of a larger scale refactoring which I do not wish to undertake as part of this series) that simply unsets the debug bit on unchanged functions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-09 16:42:40 -08:00
Matt Turner	e633fae5cb	nir: Add lowering support for 64-bit operations to software Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-09 16:42:40 -08:00
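
Commit 9de90caca8 above describes implementing round-to-even by adding and then subtracting 2**52. The following is a minimal standalone C sketch of that idea, not the NIR lowering code itself. The function name round_even_magic is hypothetical, and the sketch assumes IEEE 754 doubles, the default round-to-nearest-even rounding mode, and arithmetic evaluated in double precision (for example SSE2 rather than x87 extended precision).

```c
#include <math.h>

/* Hypothetical helper illustrating the 2**52 trick; not Mesa's code.
 * Assumes IEEE 754 doubles and the default round-to-nearest-even mode. */
static double round_even_magic(double x)
{
    /* 2**52: at this magnitude the spacing between doubles is exactly 1.0. */
    const double two52 = 4503599627370496.0;
    double ax = fabs(x);

    /* Values with |x| >= 2**52 are already integral; the negated
     * comparison also passes infinities and NaN through unchanged. */
    if (!(ax < two52))
        return x;

    /* The addition drops the fractional bits, and the hardware rounds the
     * result to nearest with ties going to even. volatile keeps the
     * compiler from folding the add/subtract pair away. */
    volatile double shifted = ax + two52;
    double rounded = shifted - two52;

    /* Restore the sign, which also preserves negative zero. */
    return copysign(rounded, x);
}
```

With those assumptions, round_even_magic(0.5) gives 0.0, round_even_magic(1.5) gives 2.0, and round_even_magic(-2.5) gives -2.0, since exact ties go to the even neighbour.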


1305 commits