fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-21 18:00:13 +01:00

Author	SHA1	Message	Date
Dave Airlie	9bb375b0be	intel/compiler: drop glsl options from brw_compiler Only the nir options are used now, since i965 was dropped, the glsl options come from the state tracker Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14102>	2021-12-07 08:52:36 +00:00
Caio Oliveira	db23c41537	intel/compiler: Add backend compiler basics for Task/Mesh Task/Mesh stages are CS-like stages, and include many builtins (e.g. workgroup ID/index) and intrinsics (e.g. workgroup memory primitives) originally present only in CS. This commit add two new stages (task and mesh) that 'inherit' from CS by embedding a brw_cs_prog_data in their own prog_data structure, so that CS functionality can be easily reused. They also currently use the same helpers to select the SIMD variant to use -- that was recently added for CS. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13661>	2021-12-04 00:41:46 +00:00
Marcin Ślusarz	d05f7b4a2c	intel: fix INTEL_DEBUG environment variable on 32-bit systems INTEL_DEBUG is defined (since `4015e1876a`) as: #define INTEL_DEBUG __builtin_expect(intel_debug, 0) which unfortunately chops off upper 32 bits from intel_debug on platforms where sizeof(long) != sizeof(uint64_t) because __builtin_expect is defined only for the long type. Fix this by changing the definition of INTEL_DEBUG to be function-like macro with "flags" argument. New definition returns 0 or 1 when any of the flags match. Most of the changes in this commit were generated using: for c in `git grep INTEL_DEBUG \| grep "&" \| grep -v i915 \| awk -F: '{print $1}' \| sort \| uniq`; do perl -pi -e "s/INTEL_DEBUG & ([A-Z0-9a-z_]+)/INTEL_DBG(\1)/" $c perl -pi -e "s/INTEL_DEBUG & ($[A-Z0-9_ \|]+$)/INTEL_DBG\1/" $c done but it didn't handle all cases and required minor cleanups (like removal of round brackets which were not needed anymore). Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13334>	2021-10-15 19:55:14 +00:00
Ian Romanick	0f809dbf40	intel/compiler: Basic support for DP4A instruction v2: Very significant rebase on changes to previous commits. Specifically, brw_fs_nir.cpp changes were pretty much rewritten from scratch after changing the NIR opcode names and types. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>	2021-08-24 19:58:57 +00:00
Ian Romanick	5ce3bfcdf3	intel/compiler: Lower 8-bit ops to 16-bit in NIR on all platforms This fixes the Crucible func.shader.shift.int8_t test on Gen8 and Gen9. See https://gitlab.freedesktop.org/mesa/crucible/-/merge_requests/76. With the previous optimizations in place, this change seems to improve the quality of the generated code. Comparing a couple Vulkan CTS tests on Skylake had the following results. dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag: SIMD8 shader: 36 instructions. 1 loops. 3822 cycles. 0:0 spills:fills, 5 sends SIMD8 shader: 27 instructions. 1 loops. 2742 cycles. 0:0 spills:fills, 5 sends dEQP-VK.spirv_assembly.type.vec3.i8.max_frag: SIMD8 shader: 39 instructions. 1 loops. 3922 cycles. 0:0 spills:fills, 5 sends SIMD8 shader: 37 instructions. 1 loops. 3682 cycles. 0:0 spills:fills, 5 sends Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>	2021-08-18 22:03:37 +00:00
Ian Romanick	f0a8a9816a	nir: intel/compiler: Add and use nir_op_pack_32_4x8_split A lot of CTS tests write a u8vec4 or an i8vec4 to an SSBO. This results in a lot of shifts and MOVs. When that pattern can be recognized, the individual 8-bit components can be packed much more efficiently. v2: Rebase on `b4369de27f` ("nir/lower_packing: use shader_instructions_pass") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>	2021-08-18 22:03:37 +00:00
Rhys Perry	ed70b256ce	nir: add ffma creation helpers Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8056>	2021-08-16 17:19:45 +00:00
Timothy Arceri	a9ed4538ab	nir: add indirect loop unrolling to compiler options This is where it should be rather than having to pass it into the optimisation pass every time. It also allows us to call the loop analysis pass without having to duplicate these options which we will do later in this series. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12064>	2021-08-03 10:54:50 +00:00
Sagar Ghuge	89b7a40212	intel/compiler: Enable has_iadd3 option on XeHP shader-db result is inconclusive but doesn't harm to include it for reference. Shader-db result on XeHPG: total instructions in shared programs: 1397405 -> 1397315 (<.01%) instructions in affected programs: 88252 -> 88162 (-0.10%) helped: 20 HURT: 7 helped stats (abs) min: 1 max: 18 x̄: 7.20 x̃: 7 helped stats (rel) min: 0.03% max: 2.20% x̄: 0.37% x̃: 0.23% HURT stats (abs) min: 4 max: 23 x̄: 7.71 x̃: 4 HURT stats (rel) min: 0.10% max: 0.68% x̄: 0.22% x̃: 0.11% 95% mean confidence interval for instructions value: -6.81 0.14 95% mean confidence interval for instructions %-change: -0.42% -0.02% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 119924219 -> 119931868 (<.01%) cycles in affected programs: 45029193 -> 45036842 (0.02%) helped: 11 HURT: 16 helped stats (abs) min: 15 max: 5490 x̄: 1655.73 x̃: 140 helped stats (rel) min: <.01% max: 0.35% x̄: 0.11% x̃: <.01% HURT stats (abs) min: 1 max: 2944 x̄: 1616.38 x̃: 1743 HURT stats (rel) min: <.01% max: 0.17% x̄: 0.09% x̃: 0.10% 95% mean confidence interval for cycles value: -606.11 1172.70 95% mean confidence interval for cycles %-change: -0.04% 0.07% Inconclusive result (value mean confidence interval includes 0). v2: - Include shader-db result (Jason) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11596>	2021-07-16 15:59:56 +00:00
Andrii Simiklit	57f54bb9cc	Remove redundant assignment Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4957 Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11780>	2021-07-09 09:34:27 +00:00
Jason Ekstrand	d055ac9bdf	intel/compiler: Add a U32 reloc type Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>	2021-06-22 21:09:25 +00:00
Jason Ekstrand	55508bbe66	intel/compiler: Generalize shader relocations a bit This commit adds a delta to be added to the relocated value as well as the possibility of multiple types of relocations. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>	2021-06-22 21:09:25 +00:00
Rhys Perry	1cbcfb8b38	nir, nir/algebraic: add byte/word insertion instructions Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>	2021-06-08 08:57:42 +00:00
Anuj Phogat	61e8636557	intel: Rename gen_device prefix to intel_device export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "gen_device" -rIl $SEARCH_PATH \| xargs sed -ie "s/gen_device/intel_device/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>	2021-04-20 20:06:33 +00:00
Anuj Phogat	926d343acf	intel: Rename files with gen_debug prefix export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" find $SEARCH_PATH -type f -name "gen_debug.[cph]" -exec sh -c 'f="{}"; mv -- "$f" "${f/gen_debug/intel_debug}"' \; grep -E "gen_debug" -rIl $SEARCH_PATH \| xargs sed -ie "s/gen_debug\./intel_debug\./g" grep -E "GEN_DEBUG" -rIl $SEARCH_PATH \| xargs sed -ie "s/GEN_DEBUG_H/INTEL_DEBUG_H/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>	2021-04-20 20:06:33 +00:00
Anuj Phogat	1d296484b4	intel: Rename Genx keyword to Gfxx Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "Gen[[:digit:]]+" -rIl $SEARCH_PATH \| xargs sed -ie "s/Gen$[[:digit:]]\+$/Gfx\1/g" Exclude changes in src/intel/perf/oa-.xml: find src/intel/perf -type f $ -name ".xml" $ \| xargs sed -ie "s/Gfx/Gen/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Anuj Phogat	abe9a71a09	intel: Rename gen field in gen_device_info struct to ver Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "info\)(.\|->)gen" -rIl $SEARCH_PATH \| xargs sed -ie "s/info$)$$\.\\|->$gen/info\1\2ver/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Christian Gmeiner	3fbde2fd93	nir: add has_txs flag Some nir lowerings might need to know if txs is supported by the backend. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8898>	2021-02-23 14:04:30 +00:00
Daniel Schürmann	bd8e84eb8d	nir: replace .lower_sub with .has_fsub and .has_isub This allows a more fine-grained control about whether a backend supports one of these instructions. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6597>	2021-01-11 19:13:51 +00:00
Jason Ekstrand	7280b0911d	intel/compiler: Add support for bindless shaders The Intel bindless thread dispatch model is very simple. When a compute shader is to be used for bindless dispatch, it can request a set of stack IDs. These are allocated per-dual-subslice by the hardware and recycled automatically when the stack ID is returned. Passed to the bindless dispatch are a global argument address, a stack ID, and an address of the BINDLESS_SHADER_RECORD to invoke. When the bindless shader is dispatched, it is passed its stack ID as well as the global and local argument pointers. The local argument pointer is the address of the BINDLESS_SHADER_RECORD plus some offset which is specified as part of the BINDLESS_SHADER_RECORD. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7356>	2020-11-25 05:37:09 +00:00
Jason Ekstrand	68092df8d8	intel/nir: Lower 8-bit ops to 16-bit in NIR on Gen11+ Intel hardware supports 8-bit arithmetic but it's tricky and annoying: - Byte operations don't actually execute with a byte type. The execution type for byte operations is actually word. (I don't know if this has implications for the HW implementation. Probably?) - Destinations are required to be strided out to at least the execution type size. This means that B-type operations always have a stride of at least 2. This means wreaks havoc on the back-end in multiple ways. - Thanks to the strided destination, we don't actually save register space by storing things in bytes. We could, in theory, interleave two byte values into a single 2B-strided register but that's both a pain for RA and would lead to piles of false dependencies pre-Gen12 and on Gen12+, we'd need some significant improvements to the SWSB pass. - Also thanks to the strided destination, all byte writes are treated as partial writes by the back-end and we don't know how to copy-prop them. - On Gen11, they added a new hardware restriction that byte types aren't allowed in the 2nd and 3rd sources of instructions. This means that we have to emit B->W conversions all over to resolve things. If we emit said conversions in NIR, instead, there's a chance NIR can get rid of some of them for us. We can get rid of a lot of this pain by just asking NIR to get rid of 8-bit arithmetic for us. It may lead to a few more conversions in some cases but having back-end copy-prop actually work is probably a bigger bonus. There is still a bit we have to handle in the back-end. In particular, basic MOVs and conversions because 8-bit load/store ops still require 8-bit types. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7482>	2020-11-09 18:58:51 +00:00
Jason Ekstrand	3d22de05ca	intel/fs: Add an option to use dataport messages for UBOs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3932>	2020-10-08 01:17:06 -05:00
Ian Romanick	60e1d0f028	intel/compiler: Remove INTEL_SCALAR_... env variables Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6826>	2020-09-28 11:43:10 -07:00
Kenneth Graunke	140f53e646	Revert "nir: replace lower_ffma and fuse_ffma with has_ffma" This reverts commit `939ddf3f67`. Intel has a separate pass for fusing FFMAs selectively. We split these flags in commit `1b72c31e1f` and the reasoning still stands. The patch being reverted was just a cleanup, so there should be no issue with reverting it. Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6849>	2020-09-24 13:11:50 -07:00
Marek Olšák	939ddf3f67	nir: replace lower_ffma and fuse_ffma with has_ffma Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6756>	2020-09-24 12:29:11 +00:00
Marek Olšák	771aad3027	nir: split lower_ffma into lower_ffma16/32/64 AMD wants different behavior for each bit size Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6756>	2020-09-24 12:29:11 +00:00
Gert Wollny	80cde3ad55	intel/compiler: Set lower_uniform_to_ubo compiler flag Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6316>	2020-09-16 10:07:42 +00:00
Jason Ekstrand	c897cd0278	intel/compiler: Handle all indirect lowering choices in brw_nir.c Since everything flows through NIR and we're doing all of our indirect deref lowering there now, there's no reason to keep making those decisions in brw_compiler and stuffing them in the GLSL compiler structs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5909>	2020-09-03 14:26:49 +00:00
Jason Ekstrand	8d8a3815ef	intel/eu: Add a mechanism for emitting relocatable constant MOVs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>	2020-09-02 19:48:44 +00:00
Jason Ekstrand	54ba0daa28	intel/compiler: Get rid of the global compaction table pointers With discrete GPUs, it's going to be possible to have GPUs from two different hardware generations in the machine at the same time. Global singletons like this aren't going to fly. Have a struct containing the pointers which gets initialized once per shader disassemble instead. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>	2020-09-02 19:48:44 +00:00
Jason Ekstrand	003b04e266	intel/compiler: Allow MESA_SHADER_KERNEL Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6280>	2020-08-12 10:11:06 +00:00
Caio Marcelo de Oliveira Filho	e2b6ccbdad	intel/compiler: Use C99 array initializers for prog_data/key sizes This is way better than a pile of STATIC_ASSERTs. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6280>	2020-08-12 10:11:06 +00:00
Boris Brezillon	025988f818	intel: Set int64_options to ~0 when lowering 64b ops That's more future proof than setting each bit manually. Looks like we already miss nir_lower_ufind_msb64 because of that. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5588>	2020-07-30 16:54:24 +00:00
Boris Brezillon	345b5847b4	nir: Replace the scoped_memory barrier by a scoped_barrier SPIRV OpControlBarrier can have both a memory and a control barrier which some hardware can handle with a single instruction. Let's turn the scoped_memory_barrier into a scoped barrier which can embed both barrier types. Note that control-only or memory-only barriers can be supported through this new intrinsic by passing NIR_SCOPE_NONE to the unused barrier type. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4900>	2020-06-03 07:39:52 +00:00
Ian Romanick	f46eabf84e	nir/algebraic: Split ibfe and ubfe with two constant sources I also tried splitting ubfe instructions with one or zero constants, and zero shaders in shader-db were affected. The "lost" shader is a compute shader that was promoted from SIMD8 to SIMD16, so is also counted as the gained shader. v2: Further restrict bfe splitting. bfe with multiple constants is better on at least some Radeon GPUs. Use -x instead of 32-x in shift counts. v3: Fix the outer shift count for ibfe lowering. Add c=0 optimizations to prevent bad lowering. Both suggested by Rhys. Add shift by -32 optimizations. Tiger Lake total instructions in shared programs: 17608764 -> 17596316 (-0.07%) instructions in affected programs: 303765 -> 291317 (-4.10%) helped: 113 HURT: 46 helped stats (abs) min: 1 max: 458 x̄: 120.67 x̃: 21 helped stats (rel) min: 0.09% max: 11.23% x̄: 3.47% x̃: 1.39% HURT stats (abs) min: 1 max: 201 x̄: 25.83 x̃: 6 HURT stats (rel) min: 0.23% max: 5.18% x̄: 1.53% x̃: 1.11% 95% mean confidence interval for instructions value: -101.13 -55.45 95% mean confidence interval for instructions %-change: -2.61% -1.44% Instructions are helped. total cycles in shared programs: 338390770 -> 333530868 (-1.44%) cycles in affected programs: 79438330 -> 74578428 (-6.12%) helped: 112 HURT: 64 helped stats (abs) min: 2 max: 268955 x̄: 44261.93 x̃: 1452 helped stats (rel) min: <.01% max: 29.51% x̄: 4.72% x̃: 2.23% HURT stats (abs) min: 2 max: 17618 x̄: 1522.41 x̃: 84 HURT stats (rel) min: <.01% max: 7.34% x̄: 1.35% x̃: 0.34% 95% mean confidence interval for cycles value: -37232.47 -17993.69 95% mean confidence interval for cycles %-change: -3.37% -1.65% Cycles are helped. total spills in shared programs: 8944 -> 8138 (-9.01%) spills in affected programs: 3240 -> 2434 (-24.88%) helped: 67 HURT: 0 total fills in shared programs: 9373 -> 7842 (-16.33%) fills in affected programs: 4736 -> 3205 (-32.33%) helped: 67 HURT: 0 LOST: 1 GAINED: 2 Ice Lake and Skylake had similar results. (Ice Lake shown) total instructions in shared programs: 16123288 -> 16116876 (-0.04%) instructions in affected programs: 241155 -> 234743 (-2.66%) helped: 126 HURT: 2 helped stats (abs) min: 1 max: 209 x̄: 50.90 x̃: 7 helped stats (rel) min: 0.07% max: 5.94% x̄: 1.76% x̃: 0.65% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.05% max: 0.24% x̄: 0.15% x̃: 0.15% 95% mean confidence interval for instructions value: -61.29 -38.89 95% mean confidence interval for instructions %-change: -2.05% -1.42% Instructions are helped. total cycles in shared programs: 335419163 -> 330438819 (-1.48%) cycles in affected programs: 77515502 -> 72535158 (-6.42%) helped: 139 HURT: 37 helped stats (abs) min: 2 max: 269140 x̄: 36374.19 x̃: 597 helped stats (rel) min: <.01% max: 28.60% x̄: 3.67% x̃: 1.31% HURT stats (abs) min: 4 max: 17618 x̄: 2045.08 x̃: 174 HURT stats (rel) min: 0.02% max: 8.32% x̄: 2.61% x̃: 0.62% 95% mean confidence interval for cycles value: -37799.30 -18795.51 95% mean confidence interval for cycles %-change: -3.13% -1.57% Cycles are helped. total spills in shared programs: 8065 -> 7306 (-9.41%) spills in affected programs: 3153 -> 2394 (-24.07%) helped: 67 HURT: 0 total fills in shared programs: 8710 -> 7412 (-14.90%) fills in affected programs: 4466 -> 3168 (-29.06%) helped: 67 HURT: 0 LOST: 1 GAINED: 1 Broadwell total instructions in shared programs: 14970538 -> 14965967 (-0.03%) instructions in affected programs: 227040 -> 222469 (-2.01%) helped: 126 HURT: 2 helped stats (abs) min: 1 max: 136 x̄: 36.29 x̃: 8 helped stats (rel) min: 0.07% max: 6.02% x̄: 1.47% x̃: 0.89% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.05% max: 0.24% x̄: 0.14% x̃: 0.14% 95% mean confidence interval for instructions value: -43.05 -28.37 95% mean confidence interval for instructions %-change: -1.69% -1.19% Instructions are helped. total cycles in shared programs: 336237662 -> 333035960 (-0.95%) cycles in affected programs: 72066394 -> 68864692 (-4.44%) helped: 134 HURT: 42 helped stats (abs) min: 4 max: 122640 x̄: 24344.54 x̃: 1833 helped stats (rel) min: <.01% max: 26.93% x̄: 4.02% x̃: 2.38% HURT stats (abs) min: 1 max: 17205 x̄: 1439.69 x̃: 92 HURT stats (rel) min: <.01% max: 7.12% x̄: 1.34% x̃: 0.62% 95% mean confidence interval for cycles value: -23753.58 -12629.40 95% mean confidence interval for cycles %-change: -3.50% -1.98% Cycles are helped. total spills in shared programs: 21122 -> 20204 (-4.35%) spills in affected programs: 3644 -> 2726 (-25.19%) helped: 67 HURT: 0 total fills in shared programs: 24879 -> 23460 (-5.70%) fills in affected programs: 4883 -> 3464 (-29.06%) helped: 67 HURT: 0 Haswell total instructions in shared programs: 13148269 -> 13145444 (-0.02%) instructions in affected programs: 137046 -> 134221 (-2.06%) helped: 97 HURT: 3 helped stats (abs) min: 1 max: 137 x̄: 30.58 x̃: 3 helped stats (rel) min: 0.14% max: 4.38% x̄: 1.38% x̃: 0.44% HURT stats (abs) min: 1 max: 70 x̄: 47.00 x̃: 70 HURT stats (rel) min: 0.05% max: 5.82% x̄: 3.90% x̃: 5.82% 95% mean confidence interval for instructions value: -37.15 -19.35 95% mean confidence interval for instructions %-change: -1.56% -0.89% Instructions are helped. total cycles in shared programs: 321221834 -> 318333159 (-0.90%) cycles in affected programs: 54932349 -> 52043674 (-5.26%) helped: 95 HURT: 53 helped stats (abs) min: 4 max: 123390 x̄: 30648.39 x̃: 702 helped stats (rel) min: <.01% max: 28.87% x̄: 4.27% x̃: 2.87% HURT stats (abs) min: 4 max: 2357 x̄: 432.49 x̃: 113 HURT stats (rel) min: <.01% max: 3.44% x̄: 1.03% x̃: 0.54% 95% mean confidence interval for cycles value: -26154.16 -12881.99 95% mean confidence interval for cycles %-change: -3.20% -1.55% Cycles are helped. total spills in shared programs: 19878 -> 19293 (-2.94%) spills in affected programs: 3020 -> 2435 (-19.37%) helped: 41 HURT: 2 total fills in shared programs: 20918 -> 19875 (-4.99%) fills in affected programs: 3968 -> 2925 (-26.29%) helped: 41 HURT: 2 LOST: 0 GAINED: 1 Ivy Bridge total instructions in shared programs: 11875585 -> 11873641 (-0.02%) instructions in affected programs: 78065 -> 76121 (-2.49%) helped: 27 HURT: 0 helped stats (abs) min: 8 max: 134 x̄: 72.00 x̃: 72 helped stats (rel) min: 0.36% max: 4.23% x̄: 2.42% x̃: 2.42% 95% mean confidence interval for instructions value: -83.68 -60.32 95% mean confidence interval for instructions %-change: -2.78% -2.07% Instructions are helped. total cycles in shared programs: 178232734 -> 175769085 (-1.38%) cycles in affected programs: 50018707 -> 47555058 (-4.93%) helped: 27 HURT: 0 helped stats (abs) min: 82035 max: 99953 x̄: 91246.26 x̃: 92278 helped stats (rel) min: 4.40% max: 5.69% x̄: 4.93% x̃: 4.95% 95% mean confidence interval for cycles value: -93674.20 -88818.32 95% mean confidence interval for cycles %-change: -5.09% -4.78% Cycles are helped. total spills in shared programs: 4182 -> 3739 (-10.59%) spills in affected programs: 1089 -> 646 (-40.68%) helped: 27 HURT: 0 total fills in shared programs: 5216 -> 4345 (-16.70%) fills in affected programs: 1874 -> 1003 (-46.48%) helped: 27 HURT: 0 No changes on any earlier Intel platforms. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4156>	2020-05-07 10:55:50 -07:00
Rhys Perry	32d871b48f	nir/algebraic: don't undo lowering of 8/16-bit comparisons to 32-bit Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4387>	2020-04-23 10:57:38 +00:00
Caio Marcelo de Oliveira Filho	956e4b2d37	nir, intel: Move use_scoped_memory_barrier to nir_options This option will be used later by GLSL, so move to a common struct. Because nir_options is filled in the compiler instead of the Vulkan driver, fix that up. GLSL will ignore that for now. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3913> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3913>	2020-02-24 19:12:11 +00:00
Ian Romanick	4e9079d0c7	i965: Enable INTEL_shader_integer_functions2 on Gen8+ v2: Use new lower_hadd64 and lower_usub_sat64 flags. v3: Enable SPIR-V capability. v4: Move lowering options to COMMON_SCALAR_OPTIONS. Suggested by Caio. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>	2020-01-23 00:18:57 +00:00
Matt Turner	49c21802cb	intel/compiler: Split has_64bit_types into float/int Gen7 has 64-bit floats but not 64-bit ints. Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>	2020-01-22 00:19:20 +00:00
Kenneth Graunke	d0d28c783d	iris: Set nir_shader_compiler_options::unify_interfaces. This is technically enabling the option in the common intel backend code, but only the st/nir linker uses the option, so it's iris-only. Fixes Piglit's spec/glsl-1.50/execution/geometry/clip-distance-vs-gs-out Closes: #2274 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3249> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3249>	2020-01-03 00:41:50 +00:00
Kenneth Graunke	44754279ac	intel/fs/gen12: Use TCS 8_PATCH mode. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-10-11 12:24:16 -07:00
Jordan Justen	f9ec4ac5a1	intel/ir: Lower fpow on Gen12. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Marek Olšák	cebc38ff60	nir: add nir_shader_compiler_options::lower_to_scalar This will replace PIPE_SHADER_CAP_SCALAR_ISA. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-10 15:49:18 -04:00
Ian Romanick	b418269d7d	intel/compiler: Request bitfield_reverse lowering on pre-Gen7 hardware See the previous commit for the explanation of the Fixes tag. Hurts 21 shaders in shader-db. All of the hurt shaders are in Unreal Engine 4 tech demos. Reviewed-by: Matt Turner <mattst88@gmail.com> Fixes: `7afa26d4e3` ("nir: Add lowering for nir_op_bitfield_reverse.")	2019-08-28 11:39:29 -07:00
Jason Ekstrand	110669c85c	st,i965: Stop looping on 64-bit lowering Now that the 64-bit lowering passes do a complete lowering in one go, we don't need to loop anymore. We do, however, have to ensure that int64 lowering happens after double lowering because double lowering can produce int64 ops. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-16 16:05:16 +00:00
Jason Ekstrand	0ba508d7a3	nir,intel: Add support for lowering 64-bit nir_opt_extract_* We need this when doing full software 64-bit emulation. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110309 Fixes: `cbad201c2b` "nir/algebraic: Add missing 64-bit extract_[iu]8..." Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-07-15 16:08:37 -05:00
Ian Romanick	1259f6d802	nir: intel/vec4: Add flag to disable some algebraic optimizations A couple patches later in this series use the flag to avoid a few thousand shader-db regresions on all vec4 platforms. I'm not particularly enamored with the name of this flag. However, I suspect the Intel vec4 backend is the only backend that will benefit from it. Specifically, the cases where this helps are all cases where we want to prevent nir_opt_algebraic from rearranging instructions to create 3-source instructions, such as ffma and flrp, with additional immediate value or uniform sources. The earlier commit "intel/vec4: Try to emit a single load for multiple 3-src instruction operands" solves most of the problems caused by additional immediate values, but the restrictions on register strides that cause problems for uniforms and shader inputs persist. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-11 10:20:03 -07:00
Jason Ekstrand	14781e2122	intel/compiler: Add a "base class" for program keys Right now, all keys have two things in common: a program string ID and a sampler_prog_key_data. I'd like to add another thing or two and need a place to put it. This commit adds a new brw_base_prog_key struct which contains those two common bits. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-10 19:35:55 +00:00
Sagar Ghuge	1e92e83856	intel/compiler: Emit ROR and ROL instruction v2: Reorder patch (Matt Turner) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-01 10:14:22 -07:00
Kenneth Graunke	c7d1b52a2c	nir: Combine lower_fmod16/32 back into a single lower_fmod. We originally had a single lower_fmod option. In commit `2ab2d2e5`, Sam split 32 and 64-bit lowering into separate flags, with the rationale that some drivers might want different options there. This left 16-bit unhandled, so Iago added a lower_fmod16 option in commit `ca31df6f`. Now that lower_fmod64 is gone (in favor of nir_lower_doubles and nir_lower_dmod), we re-combine lower_fmod16 and lower_fmod32 into a single lower_fmod flag again. I'm not aware of any hardware which need lowering for one bitsize and not the other. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-05 16:45:12 -07:00

1 2

87 commits