fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-21 11:08:24 +02:00

Author	SHA1	Message	Date
Ian Romanick	cfc0d34802	nir: See through an fneg to apply existing optimizations Doing the same for the existing feq and fne transformations didn't help anything in shader-db. shader-db results: Broadwell and Skylake (Skylake shown) total instructions in shared programs: 14529463 -> 14526147 (-0.02%) instructions in affected programs: 402420 -> 399104 (-0.82%) helped: 2136 HURT: 131 helped stats (abs) min: 1 max: 10 x̄: 1.61 x̃: 1 helped stats (rel) min: 0.03% max: 16.22% x̄: 3.14% x̃: 1.12% HURT stats (abs) min: 1 max: 2 x̄: 1.01 x̃: 1 HURT stats (rel) min: 0.13% max: 7.69% x̄: 0.75% x̃: 0.57% 95% mean confidence interval for instructions value: -1.51 -1.41 95% mean confidence interval for instructions %-change: -3.06% -2.78% Instructions are helped. total cycles in shared programs: 533146915 -> 533120531 (<.01%) cycles in affected programs: 10356261 -> 10329877 (-0.25%) helped: 1933 HURT: 844 helped stats (abs) min: 1 max: 490 x̄: 29.44 x̃: 16 helped stats (rel) min: <.01% max: 28.57% x̄: 3.43% x̃: 1.88% HURT stats (abs) min: 1 max: 423 x̄: 36.17 x̃: 12 HURT stats (rel) min: <.01% max: 23.75% x̄: 1.90% x̃: 0.59% 95% mean confidence interval for cycles value: -11.78 -7.22 95% mean confidence interval for cycles %-change: -1.98% -1.65% Cycles are helped. Haswell total instructions in shared programs: 9037416 -> 9034106 (-0.04%) instructions in affected programs: 389831 -> 386521 (-0.85%) helped: 2184 HURT: 120 helped stats (abs) min: 1 max: 11 x̄: 1.57 x̃: 1 helped stats (rel) min: 0.03% max: 25.00% x̄: 2.73% x̃: 1.02% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.19% max: 7.69% x̄: 0.81% x̃: 0.57% 95% mean confidence interval for instructions value: -1.49 -1.39 95% mean confidence interval for instructions %-change: -2.68% -2.41% Instructions are helped. 
total cycles in shared programs: 84636243 -> 84631628 (<.01%) cycles in affected programs: 4745058 -> 4740443 (-0.10%) helped: 1904 HURT: 960 helped stats (abs) min: 1 max: 466 x̄: 30.21 x̃: 18 helped stats (rel) min: 0.02% max: 36.36% x̄: 3.57% x̃: 2.38% HURT stats (abs) min: 1 max: 1080 x̄: 55.11 x̃: 14 HURT stats (rel) min: 0.02% max: 51.33% x̄: 2.77% x̃: 0.81% 95% mean confidence interval for cycles value: -4.51 1.29 95% mean confidence interval for cycles %-change: -1.64% -1.25% Inconclusive result (value mean confidence interval includes 0). LOST: 1 GAINED: 0 Sandy Bridge and Ivy Bridge (Ivy Bridge shown) total instructions in shared programs: 10018873 -> 10015456 (-0.03%) instructions in affected programs: 512820 -> 509403 (-0.67%) helped: 2268 HURT: 162 helped stats (abs) min: 1 max: 11 x̄: 1.62 x̃: 1 helped stats (rel) min: 0.03% max: 25.00% x̄: 2.47% x̃: 0.88% HURT stats (abs) min: 1 max: 4 x̄: 1.59 x̃: 1 HURT stats (rel) min: 0.09% max: 7.69% x̄: 0.86% x̃: 0.50% 95% mean confidence interval for instructions value: -1.46 -1.35 95% mean confidence interval for instructions %-change: -2.38% -2.12% Instructions are helped. total cycles in shared programs: 87538223 -> 87524771 (-0.02%) cycles in affected programs: 5435520 -> 5422068 (-0.25%) helped: 1916 HURT: 946 helped stats (abs) min: 1 max: 1392 x̄: 29.44 x̃: 18 helped stats (rel) min: <.01% max: 34.51% x̄: 3.34% x̃: 1.97% HURT stats (abs) min: 1 max: 633 x̄: 45.41 x̃: 11 HURT stats (rel) min: 0.02% max: 25.95% x̄: 2.41% x̃: 0.62% 95% mean confidence interval for cycles value: -7.34 -2.06 95% mean confidence interval for cycles %-change: -1.62% -1.26% Cycles are helped. 
LOST: 1 GAINED: 0 Iron Lake total instructions in shared programs: 7888446 -> 7886959 (-0.02%) instructions in affected programs: 331581 -> 330094 (-0.45%) helped: 1160 HURT: 97 helped stats (abs) min: 1 max: 10 x̄: 1.37 x̃: 1 helped stats (rel) min: 0.02% max: 9.68% x̄: 0.93% x̃: 0.43% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.17% max: 4.17% x̄: 0.37% x̃: 0.25% 95% mean confidence interval for instructions value: -1.25 -1.12 95% mean confidence interval for instructions %-change: -0.91% -0.75% Instructions are helped. total cycles in shared programs: 178130766 -> 178116996 (<.01%) cycles in affected programs: 12534564 -> 12520794 (-0.11%) helped: 1856 HURT: 187 helped stats (abs) min: 2 max: 202 x̄: 7.78 x̃: 4 helped stats (rel) min: <.01% max: 6.47% x̄: 0.28% x̃: 0.11% HURT stats (abs) min: 2 max: 26 x̄: 3.55 x̃: 2 HURT stats (rel) min: 0.01% max: 2.14% x̄: 0.08% x̃: 0.02% 95% mean confidence interval for cycles value: -7.41 -6.07 95% mean confidence interval for cycles %-change: -0.28% -0.22% Cycles are helped. GM45 total instructions in shared programs: 4858912 -> 4857887 (-0.02%) instructions in affected programs: 237565 -> 236540 (-0.43%) helped: 867 HURT: 57 helped stats (abs) min: 1 max: 10 x̄: 1.25 x̃: 1 helped stats (rel) min: 0.02% max: 9.38% x̄: 0.87% x̃: 0.43% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.16% max: 3.85% x̄: 0.34% x̃: 0.22% 95% mean confidence interval for instructions value: -1.18 -1.04 95% mean confidence interval for instructions %-change: -0.88% -0.71% Instructions are helped. 
total cycles in shared programs: 122189118 -> 122180816 (<.01%) cycles in affected programs: 8776418 -> 8768116 (-0.09%) helped: 1213 HURT: 166 helped stats (abs) min: 2 max: 202 x̄: 7.30 x̃: 4 helped stats (rel) min: <.01% max: 6.43% x̄: 0.25% x̃: 0.11% HURT stats (abs) min: 2 max: 26 x̄: 3.35 x̃: 2 HURT stats (rel) min: 0.01% max: 2.14% x̄: 0.06% x̃: 0.02% 95% mean confidence interval for cycles value: -6.78 -5.26 95% mean confidence interval for cycles %-change: -0.24% -0.18% Cycles are helped. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-01-30 15:40:14 -08:00
Timothy Arceri	9a2e085680	nir: add lower_all_io_to_temps flag This will be used for freedreno and vc4 which require all inputs and outputs to be copied to temps. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:08 +11:00
Timothy Arceri	3218756262	nir/st_glsl_to_nir: add param to disable splitting of inputs We need this because we will always copy fs outputs to temps and split the arrays, but do not want to do either of these with fs inputs as it is unnecessary and makes handling interpolateAt builtins difficult. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:08 +11:00
Timothy Arceri	09cd484d61	nir: partially revert `c2acf97fcc` `c2acf97fcc` changed the use of double_inputs_read to be inconsistent with its previous meaning. Here we re-enable the gather info code that was removed as the modified code from `c2acf97fcc` now uses the double_inputs member rather than double_inputs_read. This change allows us to use double_inputs_read with gallium drivers without impacting double_inputs which is used by i965. We also make use of the compiler option vs_inputs_dual_locations to allow for the difference in behaviour between drivers that handle vs inputs as taking up two locations for doubles, versus those that treat them as taking a single location. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-01-30 09:08:47 +11:00
Timothy Arceri	5b8de4bdff	nir: add vs_inputs_dual_locations compiler option Allows nir drivers to use either single or dual locations for vs double inputs. i965 uses dual locations for both OpenGL and Vulkan drivers; for now gallium OpenGL drivers only use a single location. The following patch will also make use of this option when calling nir_shader_gather_info(). Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-01-30 09:08:47 +11:00
Timothy Arceri	f63e05ae9e	compiler: tidy up double_inputs_read uses First we move double_inputs_read into a vs struct in the union, double_inputs_read is only used for vs inputs so this will save space and also allows us to add a new double_inputs field. We add the new field because `c2acf97fcc` changed the behaviour of double_inputs_read, and while it's no longer used to track actual reads in i965 we do still want to track this for gallium drivers. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 09:08:47 +11:00
Tapani Pälli	d0343bef66	nir: mark unused space in packed_tex_data This change cleans following scary warnings in valgrind output when disk cache is being written: ==6532== Uninitialised byte(s) found during client check request ==6532== at 0x14423FAD: blob_write_bytes (blob.c:152) ==6532== by 0x144240FB: blob_write_uint32 (blob.c:194) ==6532== by 0x144001A5: write_tex (nir_serialize.c:613) and later (loads of): ==6532== Use of uninitialised value of size 8 ==6532== at 0x62FCD9E: crc32_z (in /usr/lib64/libz.so.1.2.11) ==6532== by 0x13F65014: util_hash_crc32 (crc32.c:127) ==6532== by 0x13F5DABA: cache_put (disk_cache.c:947) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-29 08:11:22 +02:00
Samuel Pitoiset	d5e369ff8a	nir: add a 'const' qualifier to nir_ssa_def_components_read() To avoid compilation warnings and because this helper shouldn't update anything. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-12 12:25:17 +01:00
Dylan Baker	2083a14179	meson: Use dependencies for nir This creates two new internal dependencies, idep_nir_headers and idep_nir. The former encapsulates the generation of nir_opcodes.h and nir_builder_opcodes.h and adding src/compiler/nir as an include path. This ensures that any target that needs nir headers will have the includes and that the generated headers will be generated before the target is built. The second, idep_nir, includes the first and additionally links to libnir. This is intended to make it easier to avoid race conditions in the build when using nir, since the number of consumers for libnir and its headers is quite high. Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-11 15:40:02 -08:00
Dylan Baker	4ccb981673	meson: Use consistent style for tests Don't use intermediate variables, use consistent whitespace. Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-11 15:40:02 -08:00
Ian Romanick	fd2f4f507f	nir: Silence unused parameter warnings In file included from src/compiler/nir/nir_opt_algebraic.c:4:0: src/compiler/nir/nir_search_helpers.h: In function ‘is_not_const’: src/compiler/nir/nir_search_helpers.h:118:59: warning: unused parameter ‘num_components’ [-Wunused-parameter] is_not_const(nir_alu_instr instr, unsigned src, unsigned num_components, ^~~~~~~~~~~~~~ src/compiler/nir/nir_search_helpers.h:119:29: warning: unused parameter ‘swizzle ’ [-Wunused-parameter] const uint8_t swizzle) ^~~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-01-10 07:21:11 -08:00
Rob Clark	ea0bbe8201	nir: add missing local_group_size intrinsic For GL_ARB_compute_variable_group_size Reported-by: Karol Herbst <karolherbst@gmail.com> Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-30 12:39:07 -05:00
Dave Airlie	0e8e7ccf9d	nir/linking: always set the used_across_stages/outputs_read bits If we don't remap an output this code would trample the outputs read bits. This fixes a regression in dEQP-VK.tessellation.shader_input_output.barrier Fixes: `1c9c42d16b` (nir: add varying component packing helpers) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-19 06:44:11 +10:00
Eric Anholt	0bead224fe	nir: Add a new lowering option to lower all txd to txl. VC5 requires that all txd are lowered in the shader. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-14 14:36:17 -08:00
Eric Anholt	b08b628994	nir: Fix interaction of GL_CLAMP lowering with texture offsets. We want the clamping of the coordinate to apply after the offset, so we need to do math to lower the offset out of the instruction. Fixes texwrap offset cases for GL_CLAMP with GL_NEAREST on vc5. Note: I moved the get_texture_size() verbatim, so that it was defined before use. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-14 14:36:17 -08:00
Timothy Arceri	cab5513b47	nir: fix shift for uint64_t Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-13 13:20:28 +11:00
Jason Ekstrand	bb1e6ff161	spirv: Add a prepass to set types on vtn_values This autogenerated pass will automatically find and set the type field on all vtn_values. This way we always have the type and can use it for validation and other checks. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-11 22:28:34 -08:00
James Legg	947470d10b	nir/opcodes: Fix constant-folding of bitfield_insert Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104119 CC: <mesa-stable@lists.freedesktop.org> CC: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-12-07 08:59:36 +00:00
Jose Maria Casanova Crespo	1f440d00d2	nir: Handle fp16 rounding modes at nir_type_conversion_op nir_type_conversion enables new operations to handle rounding modes to convert to fp16 values. Two new opcodes are enabled nir_op_f2f16_rtne and nir_op_f2f16_rtz. The undefined behaviour doesn't have any effect and uses the original nir_op_f2f16 operation. v2: Indentation fixed (Jason Ekstrand) v3: Use explicit case for undefined rounding and assert if rounding mode is used for non 16-bit float conversions (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Eduardo Lima Mitev	2af63683bc	nir: Populate conversion opcodes to 16-bit types This will include the following NIR ALU opcodes: * nir_op_i2i16 * nir_op_i2f16 * nir_op_u2u16 * nir_op_u2f16 * nir_op_f2i16 * nir_op_f2u16 * nir_op_f2f16 v2: Remove "from" 16-bit in commit subject (Topi Pohjolainen) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo	d711445430	nir: Add rounding modes enum v2: Added comments describing each of the rounding modes. (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Eduardo Lima Mitev	5165e222d1	nir: Add support for 16-bit types (half float, int16 and uint16) v2: Renamed glsl_half_float_type() to glsl_float16_t_type(). (Jason Ekstrand) Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Eduardo Lima <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Jason Ekstrand	cfb81f58a0	nir: Add a vulkan_resource_reindex intrinsic This is required for being able to handle OpPtrAccessChain in SPIR-V where the base type of the incoming pointer requires us to add to the block index instead of the byte offset. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-05 22:01:54 -08:00
Timothy Arceri	d99c7e0ff1	nir: allow builtin arrays to be lowered Gallium's nir drivers expect this to be done. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-04 12:52:18 +11:00
Timothy Arceri	2bc49ac3e6	nir: add array lowering function that assumes there are no indirects The gallium glsl->nir pass currently lowers away all indirects on both inputs and outputs. This function allows us to lower vs inputs and fs outputs and also lower things one stage at a time as we don't need to worry about indirects on the other side of the shader's interface. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-04 12:52:18 +11:00
Timothy Arceri	2a35021bc6	nir: fix support for scalar arrays in nir_lower_io_types() This was just recreating the same vector type we already had and hitting an assert for scalars. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-04 09:10:30 +11:00
Timothy Arceri	1c9c42d16b	nir: add varying component packing helpers v2: update shader info input/output masks when pack components v3: make sure interpolation loc matches, this is required for the radeonsi NIR backend. v4: `33dca36f4f` fixed nir_gather_info to update outputs_read correctly, make sure we also adjust this correctly when packing components. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v3)	2017-12-04 09:10:30 +11:00
Timothy Arceri	c797bc6aa7	nir: add varying array splitting pass V2: - fix matrix support, non-array matrices were being skipped in v1 v3: - handle lowering of tcs output loads correctly - correctly mark indirect locations for either in or out not both when processing a stage. - use nir_src_copy() when lowering stores. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-04 09:10:30 +11:00
Eric Engestrom	9d281e1506	compiler: fix typo Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-11-28 10:54:38 +00:00
Eric Engestrom	7b85b9b877	compiler: use NDEBUG to guard asserts nir_validate.c's #endif already had the correct NDEBUG comment Fixes: `dcb1acdea0` "nir/validate: Only build in debug mode" Fixes: `9ff71b649b` "i965/nir: Validate that NIR passes call nir_metadata_preserve()" Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-11-28 10:54:38 +00:00
Dave Airlie	33dca36f4f	nir: fill outputs_read field and add patch outputs read (v2) This is to be used for TCS optimisations on radv. v2: don't set written on reads (nha) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-11-27 13:50:03 +10:00
Ilia Mirkin	ab336e8b46	nir: allow texture offsets with cube maps GL doesn't have this, but some hardware supports it. This is convenient for lowering tg4 to plain texture calls, which is necessary on Adreno A4xx hardware. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2017-11-25 16:56:30 -05:00
Iago Toral Quiroga	a217cbd7ec	nir/gather_info: recognize load_patch_vertices_in as a system value This intrinsic is produced to load SYSTEM_VALUE_VERTICES_IN, which is generated to load gl_PatchVerticesIn in the SPIR-V path for both Vulkan and OpenGL. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-22 08:03:55 +01:00
Alex Smith	4122d00846	nir/spirv: tg4 requires a sampler Gather operations in both GLSL and SPIR-V require a sampler. Fixes gathers returning garbage when using separate texture/samplers (on AMD, was using an invalid sampler descriptor). Signed-off-by: Alex Smith <asmith@feralinteractive.com> Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-11-13 13:38:18 +00:00
Timothy Arceri	8c9f3f2c46	nir: add streams to nir data This will be used by gallium drivers. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-12 11:08:26 +11:00
Rob Clark	ef4c42fc3a	nir: handle get_buffer_size in nir_lower_atomics_to_ssbo Overlooked initially, but we need to remap the SSBO index for this as well. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-10 08:57:33 -05:00
Matt Turner	77a63d190a	nir: Don't print swizzles when there are more than 4 components ... as can happen with various types like mat4, or else we'll smash the stack writing past the end of components_local[]. Fixes: `5a0d3e1129` ("nir: Print the components referenced for split or packed shader in/outs.") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-11-08 13:22:26 -08:00
Jason Ekstrand	ad77775809	nir: Validate base types on array dereferences We were already validating that the parent type goes along with the child type but we weren't actually validating that the parent type is reasonable. This fixes that. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:41:24 -08:00
Jason Ekstrand	ab9220edd6	nir,intel/compiler: Use a fixed subgroup size The GL_ARB_shader_ballot spec says that gl_SubGroupSizeARB is declared as a uniform. This means that it cannot change across an invocation such as a draw call or a compute dispatch. For compute shaders, we're ok because we only ever use one dispatch size. For fragment, however, the hardware dynamically chooses between SIMD8 and SIMD16 which violates the spec. Instead, let's just pick a subgroup size based on the shader stage. The fixed size we choose for compute shaders is a bit higher than strictly needed but there's no real harm in that. The advantage is that, if they do anything interesting with the value, NIR will see it as an immediate and can optimize better. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	a026458020	nir/lower_subgroups: Lower ballot intrinsics to the specified bit size Ballot intrinsics return a bitfield of subgroups. In GLSL and some SPIR-V extensions, they return a uint64_t. In SPV_KHR_shader_ballot, they return a uvec4. Also, some back-ends would rather pass around 32-bit values because it's easier than messing with 64-bit all the time. To solve this mess, we make nir_lower_subgroups take a new parameter called ballot_bit_size and it lowers whichever thing it gets in from the source language (uint64_t or uvec4) to a scalar with the specified number of bits. This replaces a chunk of the old lowering code. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	8c2bf020fd	nir/builder: Add a nir_imm_intN_t helper This lets you easily build integer immediates of arbitrary bit size. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	9b35faba42	nir/lower_system_values: Lower SUBGROUP_*_MASK based on type The SUBGROUP_*_MASK system values are uint64_t when coming in from GLSL but uvec4 when coming in from SPIR-V. Lowering based on type allows us to nicely handle both. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	3ee91ee6ac	nir: Make ballot intrinsics variable-size This way they can return either a uvec4 or a uint64_t. At the moment, this is a no-op since we still always return a uint64_t. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	ad127afcfd	nir: Add a ssa_dest_init_for_type helper This would be useful in a number of places. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	28da82f978	nir: Add a new subgroups lowering pass This commit pulls nir_lower_read_invocations_to_scalar along with most of the guts of nir_opt_intrinsics (which mostly does subgroup lowering) into a new nir_lower_subgroups pass. There are various other bits of subgroup lowering that we're going to want to do so it makes a bit more sense to keep it all together in one pass. We also move it in i965 to happen after nir_lower_system_values because we want to handle the subgroup mask system value intrinsics here. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	295605c930	intel/cs: Push subgroup ID instead of base thread ID We're going to want subgroup ID for SPIR-V subgroups eventually anyway. We really only want to push one and calculate the other from it. It makes a bit more sense to push the subgroup ID because it's simpler to calculate and because it's a real API thing. The only advantage to pushing the base thread ID is to avoid a single SHL in the shader. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	80ddfab2f5	intel/cs: Rework the way thread local ID is handled Previously, brw_nir_lower_intrinsics added the param and then emitted a load_uniform intrinsic to load it directly. This commit switches things over to use a specific NIR intrinsic for the thread id. The one thing I don't like about this approach is that we have to copy thread_local_id over to the new visitor in import_uniforms. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Gwan-gyeong Mun	fb87c40a58	nir: fix a typo Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-11-06 18:11:24 -08:00
Dave Airlie	57372c5a42	nir/serialize: fix build with gcc 4.4.7 I had to build on RHEL6 today, and noticed this. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-03 15:03:35 +10:00
Timothy Arceri	440d08fe93	nir: skip lowering sampler if there is no dereference This avoids a crash on the output of nir_lower_bitmap(). Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:19:46 +11:00
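
Commit 295605c930 above notes that pushing the subgroup ID costs only a single SHL to recover the base thread ID. A minimal sketch of that relationship follows, assuming the subgroup size is a power of two; the helper name and signature are hypothetical illustrations, not Mesa functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of Mesa): derive the index of the first
 * invocation covered by a subgroup from that subgroup's ID. */
static inline uint32_t
base_thread_id_from_subgroup_id(uint32_t subgroup_id, uint32_t subgroup_size)
{
   /* Assumption: subgroup_size is a nonzero power of two (e.g. 8, 16, 32). */
   assert(subgroup_size != 0 && (subgroup_size & (subgroup_size - 1)) == 0);

   /* Compute log2(subgroup_size) by counting shifts. */
   uint32_t shift = 0;
   while ((1u << shift) < subgroup_size)
      shift++;

   /* The single shift-left the commit message refers to. */
   return subgroup_id << shift;
}
```

With a subgroup size of 16, for example, subgroup 3 starts at invocation 48. In a real shader the shift amount would be a compile-time constant, so only the shift itself remains.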


3670 commits