fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-21 00:18:09 +02:00

Author	SHA1	Message	Date
Jason Ekstrand	85685cf932	intel/lower_mem_access_bit_sizes: Compute alignments automatically Because dup_mem_intrinsic() retains the SSA offset from the original intrinsic and only modifies it by adding a constant, we can compute the alignment based on the original alignment and the constant offset. This is both easier and more accurate. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19580>	2022-11-09 19:45:10 +00:00
Lionel Landwerlin	97b3dd34c1	anv: fix missing VkPhysicalDeviceExtendedDynamicState3PropertiesEXT handling Fixes: `13c422e1b2` ("anv: toggle on EXT_extended_dynamic_state3") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19573>	2022-11-08 15:28:57 +00:00
Caio Oliveira	22d8ed84b8	intel/compiler: Remove unused fs_visitor::emit_percomp() Since `7ef7738a61` ("i965: Write gl_FragCoord directly to the destination.") this is not used. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19586>	2022-11-08 07:33:09 +00:00
Caio Oliveira	90861e6fea	intel/compiler: Remove various unused function declarations Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19586>	2022-11-08 07:33:08 +00:00
Caio Oliveira	48506a9029	intel/compiler: Remove unused data members Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19586>	2022-11-08 07:33:08 +00:00
Ian Romanick	9abeb3d739	intel/fs: Optimize integer multiplication of large constants by factoring Many Intel platforms can only perform 32x16 bit multiplication. The straightforward way to implement 32x32 bit multiplications is by splitting one of the operands into high and low parts called H and L, respectively. The full multiplication can be implemented as: ((A * H) << 16) + (A * L) On Intel platforms, special register accesses can be used to eliminate the shift operation. This results in three instructions and a temporary register for most values. If H or L is 1, then one (or both) of the multiplications will later be eliminated. On some platforms it may be possible to eliminate the multiplication when H is 256. If L is zero (note that H cannot be zero), one of the multiplications will also be eliminated. Instead of splitting the operand into high and low parts, it may be possible to factor the operand into two 16-bit factors X and Y. The original multiplication can be replaced with (A * (X * Y)) = ((A * X) * Y). This requires two instructions without a temporary register. I may have gone a bit overboard with optimizing the factorization routine. It was a fun brainteaser, and I couldn't put it down. :) On my 1.3GHz Ice Lake, a standalone test could chug through 1,000,000 randomly selected values in about 5.7 seconds. This is about 9x the performance of the obvious, straightforward implementation that I started with. v2: Drop an unnecessary return. Rearrange logic slightly and rename variables in factor_uint32 to better match the names used in the large comment. Both suggested by Caio. Rearrange logic to avoid possibly using `a` uninitialized. Noticed by Marcin. v3: Use DIV_ROUND_UP instead of open coding it. Noticed by Caio. Tiger Lake, Ice Lake, Haswell, and Ivy Bridge had similar results.
(Ice Lake shown) total instructions in shared programs: 19912558 -> 19912526 (<.01%) instructions in affected programs: 3432 -> 3400 (-0.93%) helped: 10 / HURT: 0 total cycles in shared programs: 856413218 -> 856412810 (<.01%) cycles in affected programs: 122032 -> 121624 (-0.33%) helped: 9 / HURT: 0 No shader-db changes on any other Intel platforms. Tiger Lake and Ice Lake had similar results. (Ice Lake shown) Instructions in all programs: 141997227 -> 141996923 (-0.0%) Instructions helped: 71 Cycles in all programs: 9162524757 -> 9162523886 (-0.0%) Cycles helped: 63 Cycles hurt: 5 No fossil-db changes on any other Intel platforms. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17718>	2022-11-08 00:02:16 +00:00
Ian Romanick	5ec75ca10d	intel/compiler: Teach signed integer range analysis about imax and imin This is especially helpful for a*isign(a) generated by idiv_by_const optimization. On many GPUs, isign(a) is lowered to imax(imin(a, 1), -1). There are no changes on fossil-db because ANV uses a different optimization path for idiv with a constant denominator. A future MR will change this. NOTE: This commit used to help a few hundred shader-db shaders, but now none are affected. I suspect this is due to some change in the idiv_by_const optimization. This could possibly be dropped. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17718>	2022-11-08 00:02:16 +00:00
Ian Romanick	1b0da3a765	intel/compiler: Signed integer range analysis for imul_32x16 generation Only iabs and ineg are treated specially. Everything else just uses nir_unsigned_upper_bound. The special treatment of source modifiers is because they cause problems for nir_unsigned_upper_bound. Once those are peeled off, nir_unsigned_upper_bound can generally produce a tighter bound. Future commits will add more opcodes. This mostly introduces the basic framework. v2: Add a bunch of comments to signed_integer_range_analysis. Re-arrange the code a little to reduce duplication. Both suggested by Caio. Rearrange some logic to simplify things. Suggested by Marcin. Tiger Lake, Ice Lake, Haswell, and Ivy Bridge had similar results. (Ice Lake shown) total instructions in shared programs: 19912894 -> 19912558 (<.01%) instructions in affected programs: 109275 -> 108939 (-0.31%) helped: 74 / HURT: 0 total cycles in shared programs: 856422769 -> 856413218 (<.01%) cycles in affected programs: 15268102 -> 15258551 (-0.06%) helped: 65 / HURT: 4 total fills in shared programs: 8218 -> 8217 (-0.01%) fills in affected programs: 1171 -> 1170 (-0.09%) helped: 1 / HURT: 0 Skylake and Broadwell had similar results. 
(Skylake shown) total cycles in shared programs: 845145547 -> 845142263 (<.01%) cycles in affected programs: 15261465 -> 15258181 (-0.02%) helped: 65 / HURT: 0 Tiger Lake Instructions in all programs: 157580768 -> 157579730 (-0.0%) Instructions helped: 312 Instructions hurt: 28 Cycles in all programs: 7566977172 -> 7566967746 (-0.0%) Cycles helped: 288 Cycles hurt: 53 Spills in all programs: 19701 -> 19700 (-0.0%) Spills helped: 2 Spills hurt: 4 Fills in all programs: 33311 -> 33335 (+0.1%) Fills helped: 5 Fills hurt: 4 Ice Lake Instructions in all programs: 141998667 -> 141997227 (-0.0%) Instructions helped: 420 Instructions hurt: 3 Cycles in all programs: 9162565297 -> 9162524757 (-0.0%) Cycles helped: 389 Cycles hurt: 29 Spills in all programs: 19918 -> 19916 (-0.0%) Spills helped: 2 Spills hurt: 3 Fills in all programs: 32795 -> 32814 (+0.1%) Fills helped: 6 Fills hurt: 3 Skylake Instructions in all programs: 132567691 -> 132567745 (+0.0%) Instructions hurt: 24 Cycles in all programs: 8828897462 -> 8828889517 (-0.0%) Cycles helped: 405 Cycles hurt: 6 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17718>	2022-11-08 00:02:16 +00:00
Ian Romanick	f90d71055b	intel/compiler: Add and use a pass to generate imul_32x16 instructions Gfx8 and Gfx9 platforms are helped for cycles because now many instructions like mul(8) g12<1>D g10<8,8,1>D 6D become mul(8) g12<1>D g10<8,8,1>D 6W It is the same number of instructions, but the 32x16 multiply is a little faster. v2: Fix transposed hi and lo in "(hi >= INT16_MIN && lo <= INT16_MAX)". Noticed by Caio. Use nir_src_is_const instead of open coding it. Suggested by Caio. Broadwell and Skylake had similar results. (Skylake shown) total cycles in shared programs: 845748380 -> 845145547 (-0.07%) cycles in affected programs: 446346348 -> 445743515 (-0.14%) helped: 6017 HURT: 0 helped stats (abs) min: 2 max: 7380 x̄: 100.19 x̃: 8 helped stats (rel) min: <.01% max: 3.72% x̄: 0.41% x̃: 0.39% 95% mean confidence interval for cycles value: -113.37 -87.00 95% mean confidence interval for cycles %-change: -0.42% -0.41% Cycles are helped. Skylake Cycles in all programs: 8844820715 -> 8828897462 (-0.2%) Cycles helped: 47914 Cycles hurt: 1 No shader-db or fossil-db changes on any other Intel platform. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17718>	2022-11-08 00:02:16 +00:00
Ian Romanick	9479e3a19b	intel/fs: Allow constant copy prop from DW to W This enables copy propagation of mov(8) g5<1>UD 0x00000180UD mul(8) g10<1>D g2.3<0,1,0>D g5<16,8,2>W into mul(8) g10<1>D g2.3<0,1,0>D 180W This is necessary for any optimization passes that generate imul_32x16 instructions. No fossil-db or shader-db changes on any Intel platform. v2: Fix type size check to (src size != 2) || (dest size != 4). It was previously &&. :( This allowed copying constants into UB sources, and that is invalid. v3: Fix incorrect extraction of upper 16-bits of immediate value when subnr=2. Noticed by Caio. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17718>	2022-11-08 00:02:16 +00:00
Ian Romanick	90d267b2d1	intel/fs: Fix bounds checking for integer multiplication lowering The previous bounds checking would cause mul(8) g121<1>D g120<8,8,1>D 0xec4dD to be lowered to mul(8) g121<1>D g120<8,8,1>D 0xec4dUW mul(8) g41<1>D g120<8,8,1>D 0x0000UW add(8) g121.1<2>UW g121.1<16,8,2>UW g41<16,8,2>UW Instead of picking the bounds (and the new type) based on the old type, pick the new type based on the value only. This helps a few fossil-db shaders in Witcher 3 and Geekbench5. No changes on any other Intel platforms. Tiger Lake Instructions in all programs: 157581069 -> 157580768 (-0.0%) Instructions helped: 24 Cycles in all programs: 7566979620 -> 7566977172 (-0.0%) Cycles helped: 22 Cycles hurt: 4 Ice Lake Instructions in all programs: 141998965 -> 141998667 (-0.0%) Instructions helped: 26 Cycles in all programs: 9162568666 -> 9162565297 (-0.0%) Cycles helped: 24 Cycles hurt: 2 Skylake No changes. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17718>	2022-11-08 00:02:16 +00:00
Ian Romanick	db20412168	intel/fs: Fix constant propagation into 32x16 integer multiplication Don't copy propagate the constant in situations like mov(8) g8<1>D 0x7fffffffD mul(8) g16<1>D g8<8,8,1>D g15<16,8,2>W On platforms that only have a 32x16 multiplier, this will result in lowering the multiply to mul(8) g15<1>D g14<8,8,1>D 0xffffUW mul(8) g16<1>D g14<8,8,1>D 0x7fffUW add(8) g15.1<2>UW g15.1<16,8,2>UW g16<16,8,2>UW On Gfx8 and Gfx9, which have the full 32x32 multiplier, it results in mul(8) g16<1>D g15<16,8,2>W 0x7fffffffD Volume 2a of the Skylake PRM says: When multiplying a DW and any lower precision integer, the DW operand must on src0. See also https://gitlab.freedesktop.org/mesa/crucible/-/merge_requests/104. Previous to INTEL_shader_integer_functions2 (in Vulkan or OpenGL), I don't think it would be possible to create a situation where this could occur. I discovered this via some optimizations that can determine that the non-constant source must be able to fit in 16-bits. The case listed above came from piglit's "ext_transform_feedback-order arrays points" with those optimizations in place. No shader-db or fossil-db changes on any Intel platform. Fixes: `de6c0f8487` ("intel/fs: Implement support for NIR opcodes for INTEL_shader_integer_functions2") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17718>	2022-11-08 00:02:16 +00:00
José Roberto de Souza	41ee836c9a	intel: Add and use intel_gem_can_render_on_fd() Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19425>	2022-11-07 17:22:14 +00:00
José Roberto de Souza	29550bc50a	intel: Add has_context_isolation to intel_device_info Iris, hasvk and anv were fetching the same information, better do it in one place. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19425>	2022-11-07 17:22:14 +00:00
José Roberto de Souza	d5d1331381	intel: Add has_userptr_probe to intel_device_info Iris, hasvk and anv were fetching the same information, better do it in one place. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19425>	2022-11-07 17:22:14 +00:00
José Roberto de Souza	e9eceb1106	intel: Add has_mmap_offset to intel_device_info All 4 drivers were fetching the same information, better do it in one place. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19425>	2022-11-07 17:22:14 +00:00
José Roberto de Souza	dfd20f002f	intel: Add and use intel_gem_get_param() Again sharing the same function across all Intel drivers. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19425>	2022-11-07 17:22:14 +00:00
Jason Ekstrand	402a9a36f0	anv: Rip out shadow surfaces These are only used for storage-compatible compressed surfaces on Broadwell and earlier and Stencil on Gfx7 where there isn't proper stencil sampling support. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18402>	2022-11-07 12:07:11 +00:00
Francisco Jerez	5d4df3ac23	intel/compiler: Run extra fp64 lowering pass on devices that don't support int64. In some cases nir_lower_int64 will emit fp64 operations which aren't natively supported on any Intel hardware (e.g. ftrunc, frem). An extra pass of nir_opt_algebraic (for frem) and nir_lower_doubles is required in order to take care of them. This fixes several int64 test-cases on MTL hardware. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Mykhailo Skorokhodov <mykhailo.skorokhodov@globallogic.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19390>	2022-11-07 07:35:22 +00:00
Tomeu Vizoso	ec9b9ff971	ci: Disable automatic jobs on Chromebooks with Comet Lake During the weekend they started to show network problems so often that they are unable to take on jobs. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19566>	2022-11-07 06:50:16 +01:00
Benjamin Tissoires	67cee534a8	CI: convert to use the new S3 server instead of the legacy minio We don't need to login anymore, but we can't use plain minio commands now. `ci-fairy` got a helper as `s3cp` to keep an almost identical API. Signed-off-by: Benjamin Tissoires <benjamin.tissoires@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19076>	2022-11-04 11:41:42 +00:00
Mauro Rossi	814b822fe0	hasvk: fix android build and reported API version anv_device.c for vulkan.intel_hasvk requires changes to be compiled and behave correctly for android target Fixes the following building error: FAILED: src/intel/vulkan_hasvk/libanv_hasvk_common.a.p/anv_device.c.o ... ../src/intel/vulkan_hasvk/anv_device.c:143:19: error: use of undeclared identifier 'ANV_API_VERSION_1_3' *pApiVersion = ANV_API_VERSION_1_3; ^ ../src/intel/vulkan_hasvk/anv_device.c:1822:44: error: use of undeclared identifier 'ANV_API_VERSION_1_3' .apiVersion = pdevice->use_softpin ? ANV_API_VERSION_1_3 : ANV_API_VERSION_1_2, ^ ../src/intel/vulkan_hasvk/anv_device.c:1822:66: error: use of undeclared identifier 'ANV_API_VERSION_1_2' .apiVersion = pdevice->use_softpin ? ANV_API_VERSION_1_3 : ANV_API_VERSION_1_2, ^ 3 errors generated. Cc: "22.3" mesa-stable Fixes: `00eefdc` ("hasvk: stop advertising Vk 1.3 on non-softpin") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19452>	2022-11-03 23:33:14 +00:00
José Roberto de Souza	fd14fcb9f9	intel: Add and use intel_gem_get_context_param() Again sharing the same function across all Intel drivers. There are still two additional DRM_IOCTL_I915_GEM_CONTEXT_GETPARAM calls, one in intel/dev and the other in perf. The first one can't call intel_gem_get_context_param() because of the build order of libs and the second one because it sets the size parameter. Will revisit those calls in future but this is already an improvement. v2: - using intel_gem_get_context_param() for the recently added query for I915_CONTEXT_PARAM_PROTECTED_CONTENT Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18974>	2022-11-03 21:01:30 +00:00
José Roberto de Souza	39486661e9	intel: Add and use intel_gem_set_context_param() Again sharing the same function across all Intel drivers. Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18974>	2022-11-03 21:01:30 +00:00
José Roberto de Souza	6ae6921216	intel: Add and use intel_gem_destroy_context() Again sharing the same function across all Intel drivers. Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18974>	2022-11-03 21:01:30 +00:00
José Roberto de Souza	f928ead625	intel: Add and use intel_gem_create_context() Add intel_gem_create_context() to common/intel_gem.c/h and use it on Iris, Crocus, ANV and HASVK. Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18974>	2022-11-03 21:01:30 +00:00
José Roberto de Souza	ce4a7e7d40	intel: Refactor intel_gem_create_context_engines() This function was returning an int but there was no meaningful errno code being returned, also context_id is a uint32_t which would be problematic if i915 ever returned 2147483648(-1). So here changing the return type and adding a context_id pointer parameter. Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18974>	2022-11-03 21:01:30 +00:00
José Roberto de Souza	5f7c2b0e16	intel/common: Add and use intel_gem_create_context_ext() v2: - added flag mask bit to enable context protected and recoverable v3: - added enum intel_gem_create_context_flags Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18974>	2022-11-03 21:01:30 +00:00
Lionel Landwerlin	ba0336ab3f	anv: Reduce RHWO optimization (Wa_1508744258) Implement Wa_1508744258: Disable RHWO by setting 0x7010[14] by default except during resolve pass. Disable the RCC RHWO optimization at all times except when resolving single sampled color surfaces. v2: Move stalling to genX(cmd_buffer_apply_pipe_flushes) for clarity (Mark) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Mark Janes <markjanes@swizzler.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19450>	2022-11-03 10:47:59 +00:00
Jordan Justen	d911eb17d8	intel/dev: Set has_lsc in XEHP_FEATURES rather than DG2_FEATURES MTL will want this set as well. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19447>	2022-11-02 20:00:08 +00:00
Marcin Ślusarz	dcaaeb56ef	anv: program 3DSTATE_MESH_DISTRIB with the recommended values It improves performance of vk_meshlet_cadscene on A770. Fixes: `f083df8710` ("anv: update task/mesh distribution with the recommended values") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19412>	2022-11-02 08:56:53 +00:00
Marcin Ślusarz	d1d2dee970	anv: set 3DSTATE_[MESH|TASK]_CONTROL.MaximumNumberofThreadGroups Documentation is worded in a confusing way, which may be understood that we don't have to set this field to get good results. MESH part of this commit improves performance of vk_meshlet_cadscene by a factor of 2 on A380. Fixes: `ef04caea9b` ("anv: Implement Mesh Shading pipeline") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19412>	2022-11-02 08:56:53 +00:00
Marcin Ślusarz	11612d81b7	intel/genxml: fix width of 3DSTATE_TASK_CONTROL.MaximumNumberofThreadGroups Fixes: `3567d47f3e` ("intel/genxml: Inline the BODY structs into the instructions") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19412>	2022-11-02 08:56:53 +00:00
Illia Abernikhin	aa4ac5ff8b	utils: Merge util/debug.* into util/u_debug.* and remove util/debug.* Rename env_var_as_unsigned() -> debug_get_num_option(), because duplicate Rename env_var_as_bool() -> debug_get_bool_option(), because duplicate Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7177 Signed-off-by: Illia Abernikhin <illia.abernikhin@globallogic.com> Reviewed-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19336>	2022-11-02 07:25:39 +00:00
Kenneth Graunke	fde99747e9	nir: Drop infer_non_readable option for nir_opt_access() Everybody sets it to true now, and the only reason for the option to exist was to work around a bug that's now been fixed. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19162>	2022-11-02 03:42:04 +00:00
Kenneth Graunke	88756cee8d	intel/compiler: Run nir_opt_large_constants before scalarizing consts nir_opt_large_constants balks at seeing a store_deref of a variable where the source is a vecN operation of multiple load_consts, and thinks that isn't a constant, so it should not bother promoting it. Unfortunately, we were running nir_lower_load_const_to_scalar before nir_opt_large_constants, so this prevented a ton of constant promotion. This commit /used to help/ some shaders in shader-db. Presumably since !16770 landed, those shaders were already helped. Currently there are no shader-db changes on any Intel platform. Fossil-db results: All Intel platforms had similar results. (Ice Lake shown) Instructions in all programs: 141998227 -> 141421756 (-0.4%) Instructions helped: 12515 Instructions hurt: 237 SENDs in all programs: 7437925 -> 7468033 (+0.4%) SENDs hurt: 12806 Cycles in all programs: 9161655753 -> 9132869800 (-0.3%) Cycles helped: 10163 Cycles hurt: 2637 Spills in all programs: 19977 -> 18678 (-6.5%) Spills helped: 384 Spills hurt: 40 Fills in all programs: 32863 -> 31396 (-4.5%) Fills helped: 385 Fills hurt: 42 Lost: 1 Lots of Shadow of the Tomb Raider fragment shaders and Batman Arkham Origins vertex shaders were hurt for SENDs in this commit. A couple Aztec Ruins compute shaders and Spaceship shaders (multiple stages) were also hurt. All of the shaders hurt for spills or fills were Spaceship compute shaders. Nearly all of the shaders helped were Shadow of the Tomb Raider fragment shaders. One Spaceship shader was really, REALLY helped: Spills helped fossils/fossil-db/Spaceship.run.9f90a2a226fcc57f.1.foz/0b507d3abe2e3c28/compute: 321 -> 13 (-96.0%) Fills helped fossils/fossil-db/Spaceship.run.9f90a2a226fcc57f.1.foz/0b507d3abe2e3c28/compute: 279 -> 21 (-92.5%) Overall this seems like an improvement, but we may want to actually run these few benchmarks before landing.
Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16539>	2022-11-01 14:55:21 -07:00
Nanley Chery	0fa540ef61	iris: Reduce use of RHWO optimization (Wa_1508744258) Implement Wa_1508744258: Disable RHWO by setting 0x7010[14] by default except during resolve pass. Disable the RCC RHWO optimization at all times except when resolving single sampled color surfaces. MCS partial resolves are done via software (i.e., not via a HW bit) and so are not expected to need this workaround. Reviewed-by: Mark Janes <markjanes@swizzler.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19360>	2022-10-31 23:26:06 +00:00
Tapani Pälli	7cfd0e8d31	hasvk: remove some unused functions Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19368>	2022-10-31 06:59:36 +00:00
Tapani Pälli	f9176d9b2c	anv: remove some unused functions Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19368>	2022-10-31 06:59:36 +00:00
Lionel Landwerlin	6b52834ece	anv: remove shader fp64 inspection after parsing Unfortunately some crucible tests are using all floating point widths in a single shader and specializing a variable to select what code path to use for a particular supported floating point width. This is reporting errors in the validation layers. Remove the validation for now. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `8c4c4c3ee1` ("anv: Add softtp64 workaround") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Mykhailo Skorokhodov <mykhailo.skorokhodov@globallogic.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19401>	2022-10-30 22:16:52 +00:00
José Roberto de Souza	bb9f66800c	intel/perf: Use intel_device_info functions to compute subslice and eu totals Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19359>	2022-10-28 20:12:07 +00:00
Mykhailo Skorokhodov	a954933f4f	drirc: Add fp64_workaround_enabled option Signed-off-by: Mykhailo Skorokhodov <mykhailo.skorokhodov@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18854>	2022-10-28 10:08:50 +00:00
Mykhailo Skorokhodov	8c4c4c3ee1	anv: Add softtp64 workaround Pass float64.glsl into nir_lower_doubles() resolves the problem on ICL/TGL when the shader uses float64, but the device doesn't support that type. Signed-off-by: Mykhailo Skorokhodov <mykhailo.skorokhodov@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18854>	2022-10-28 10:08:50 +00:00
Mykhailo Skorokhodov	829d74b2f2	anv/meson: Add float64_spv_h custom target Signed-off-by: Mykhailo Skorokhodov <mykhailo.skorokhodov@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18854>	2022-10-28 10:08:50 +00:00
Lionel Landwerlin	920aed2121	intel/compiler: don't allocate compaction arrays on the stack Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7569 Cc: mesa-stable Reviewed-by: Luis Felipe Strano Moraes <luis.strano@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19339>	2022-10-28 07:10:58 +00:00
Matt Turner	3ef88cd0a2	intel/dev: Set display_ver = 13 on all ADL/RPL/DG2 display_ver doesn't seem to be used anywhere, but if that were to change, we'd want this to be consistent. Fixes: `c746bf4c5c` ("intel/dev: Add display_ver and set adl-p to 13") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19354>	2022-10-28 01:31:44 +00:00
Lionel Landwerlin	e59c4a912b	intel/fs: use fs implementation of dump_instructions This specialized version prints out the liveness count as well as the maximum liveness count. It was eye opening when seeing the max liveness jump after lowering of packing instructions which should not have changed the count. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18657>	2022-10-27 21:05:00 +00:00
Lionel Landwerlin	e5dfff0946	intel/fs: reduce liveness of variables in lowering passes When lowering a single instruction with a destination VGRF to 2 or more, the VGRF is now considered partially written by each generated instruction and that increases its liveness, especially in loops, thus potentially increasing the number of spills/fills due to register allocation. Putting an UNDEF instruction in front of the lowered instructions allows the IR to limit the liveness of the VGRF, reducing register pressure. This has a pretty dramatic effect on spills/fills for RT shaders. Here are the stats on Q2RTX shaders on DG2 (wiping out any spills/fills due to register allocation): Instructions in all programs: 26150 -> 24955 (-4.6%) SENDs in all programs: 1148 -> 1148 (+0.0%) Loops in all programs: 4 -> 4 (+0.0%) Cycles in all programs: 392179 -> 332787 (-15.1%) Spills in all programs: 132 -> 116 (-12.1%) Fills in all programs: 262 -> 154 (-41.2%) Shader-db results on TGL : total instructions in shared programs: 21158140 -> 21158377 (<.01%) instructions in affected programs: 76629 -> 76866 (0.31%) helped: 18 HURT: 20 helped stats (abs) min: 1 max: 60 x̄: 18.89 x̃: 12 helped stats (rel) min: 0.21% max: 3.61% x̄: 1.02% x̃: 0.77% HURT stats (abs) min: 1 max: 79 x̄: 28.85 x̃: 18 HURT stats (rel) min: 0.04% max: 2.81% x̄: 1.13% x̃: 0.79% 95% mean confidence interval for instructions value: -4.82 17.30 95% mean confidence interval for instructions %-change: -0.34% 0.57% Inconclusive result (value mean confidence interval includes 0).
total loops in shared programs: 5753 -> 5753 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 798856834 -> 798870688 (<.01%) cycles in affected programs: 6208395 -> 6222249 (0.22%) helped: 22 HURT: 17 helped stats (abs) min: 2 max: 8794 x̄: 1438.18 x̃: 782 helped stats (rel) min: 0.05% max: 2.28% x̄: 0.63% x̃: 0.44% HURT stats (abs) min: 2 max: 19178 x̄: 2676.12 x̃: 1358 HURT stats (rel) min: 0.04% max: 23.49% x̄: 2.25% x̃: 0.71% 95% mean confidence interval for cycles value: -952.19 1662.65 95% mean confidence interval for cycles %-change: -0.64% 1.90% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 4078 -> 4066 (-0.29%) spills in affected programs: 40 -> 28 (-30.00%) helped: 2 HURT: 0 total fills in shared programs: 2856 -> 2832 (-0.84%) fills in affected programs: 127 -> 103 (-18.90%) helped: 2 HURT: 0 total sends in shared programs: 998554 -> 998554 (0.00%) sends in affected programs: 0 -> 0 helped: 0 HURT: 0 LOST: 0 GAINED: 0 Total CPU time (seconds): 2346.06 -> 2304.80 (-1.76%) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18657>	2022-10-27 21:05:00 +00:00
Lionel Landwerlin	dd6d40429b	intel/fs: make split_virtual_grfs deal with partial undefs v2: fix up UNDEFs instructions (Curro) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18657>	2022-10-27 21:05:00 +00:00
Lionel Landwerlin	14b99df7d9	intel/fs: require UNDEFs register offsets to be aligned to REG_SIZE Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18657>	2022-10-27 21:05:00 +00:00
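
As a rough illustration of the arithmetic described in commit 9abeb3d739 above (splitting a 32-bit constant into 16-bit halves H and L, versus factoring it into two 16-bit factors X and Y), here is a minimal C sketch. It is not Mesa's implementation: `split_mul`, `try_factor_u16` and `factor_mul` are hypothetical names introduced only for this example, and the factor search is plain trial division, not the optimized routine the commit message mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: multiply by a 32-bit constant c using only
 * products in which the constant operand fits in 16 bits.
 * Computes ((a * H) << 16) + (a * L), wrapping modulo 2^32. */
static uint32_t split_mul(uint32_t a, uint32_t c)
{
    uint32_t h = c >> 16;      /* high 16 bits of the constant */
    uint32_t l = c & 0xffffu;  /* low 16 bits of the constant */
    return ((a * h) << 16) + (a * l);
}

/* Hypothetical helper: naive trial division looking for x * y == c with
 * both factors at most 0xffff. Returns false when no such pair exists. */
static bool try_factor_u16(uint32_t c, uint32_t *x, uint32_t *y)
{
    for (uint32_t d = 2; d <= 0xffffu && (uint64_t)d * d <= c; d++) {
        if (c % d == 0 && c / d <= 0xffffu) {
            *x = d;
            *y = c / d;
            return true;
        }
    }
    return false;
}

/* Hypothetical helper: (a * (x * y)) rewritten as ((a * x) * y),
 * so each multiply has one operand that fits in 16 bits. */
static uint32_t factor_mul(uint32_t a, uint32_t x, uint32_t y)
{
    assert(x <= 0xffffu && y <= 0xffffu);
    return (a * x) * y;
}
```

For a constant such as 70000, which does not fit in 16 bits, `try_factor_u16` finds 2 and 35000, so the product takes two multiplies and no add. A constant such as 65537 has no factorization of this kind and would fall back to the split form.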


8616 commits